ABSTRACT

We present a method to identify Ultra Faint Dwarf Galaxy (UFDG) candidates
in the halo of the Milky Way using the future Gaia catalogue, and we explore its
detection limits and completeness. The method is based on the Wavelet Transform
and searches for over-densities in the combined space of sky coordinates
and proper motions, using kinematics in the search for the first time. We test the
method with a Gaia mock catalogue that has the Gaia Universe Model Snapshot
(GUMS) as a background, and use a library of around 30 000 UFDGs simulated
as Plummer spheres with a single stellar population. For the UFDGs we use a
wide range of structural and orbital parameters that go beyond the range spanned
by real systems, where some UFDGs may remain undetected. We characterize
the detection limits as a function of the number of stars observable by Gaia in the
UFDGs with respect to that of the background, and of their apparent sizes in the
sky and proper motion planes. We find that adding proper motions to the
search considerably improves the detections compared to a photometric survey at
the same magnitude limit. Our experiments suggest that Gaia will be able to detect
UFDGs that are similar to some of the known UFDGs even though the limit of Gaia
is around 2 magnitudes brighter than that of SDSS, with the advantage of having a
full-sky catalogue. We also find that Gaia could even detect some UFDGs that have
lower surface brightness than the SDSS limit.

Key words: The Galaxy; halo, formation - galaxies: dwarf - dark matter -
methods: data analysis - Astronomy & Celestial Mechanics: astrometry


1 INTRODUCTION 

The current cosmological cold dark matter paradigm posits
the assembly of large structures in the Universe from
smaller ones (Press & Schechter 1974; White & Rees 1978;
Springel, Frenk & White 2006). A galaxy like ours must
have formed by the merger of a large number of smaller
systems that, even today, must still be in the process of being
accreted. A discrepancy between the predicted and observed
number of galaxy satellites has given rise to the so-called
“missing satellite problem” (Klypin et al. 1999; Moore et al.
1999). However, in recent years, an entirely new population
of hitherto unknown systems with very low luminosity and
surface brightness, dominated by dark matter, the so-called
“Ultra Faint Dwarf Galaxies” (UFDGs), has been discovered,
opening up the possibility of resolving this problem
(e.g. Simon & Geha 2007; Bullock 2010). The knowledge
of their structural properties, chemical abundances and stellar
populations is also key to understanding fundamental issues
(see review by Belokurov 2013) like the process of star
formation and the role of feedback in these relatively
low-mass environments (Brown et al. 2014); how to distinguish
between a dwarf galaxy and a globular cluster in some extreme
cases (e.g. Segue 1 and 2, Willman 1, Boo II and
CmB, Forbes & Kroupa 2011); or to what extent UFDGs
could have contributed to the stellar population found in the
Galactic halo today (e.g. Kirby et al. 2008).

So far, all known UFDGs were discovered as over-densities
in deep large-area photometric surveys, the vast
majority in SDSS (e.g. Willman et al. 2005a,b; Zucker et al.
2006a,b; Belokurov et al. 2006; Belokurov et al. 2007;
Belokurov et al. 2008, 2009, 2010), together with recent
findings in Pan-STARRS (Laevens et al. 2015) and the Dark
Energy Survey (DES, Koposov et al. 2015; Bechtol et al.
2015).

The ESA Gaia mission, launched in December 2013,
offers excellent prospects for the discovery of new members
of the UFDG population. Gaia (de Bruijne 2012;
Perryman et al. 2001) will measure accurate positions,
parallaxes and proper motions for all stars out to its survey limit
of G = 20 (V = 20-22, depending on the colour of the
source), where G is the white light photometric pass-band
of Gaia (Jordi et al. 2010). Multi-colour photometry will be
obtained for all stars and radial velocities will be collected
for stars at G_RVS < 16 mag, where G_RVS indicates the
pass-band of the Radial Velocity Spectrograph on board. Gaia
will also provide astrophysical information on all the sources
observed, primarily through multi-colour photometry. The
astrophysical parameters of all Gaia sources will be provided
as part of the survey data products (Bailer-Jones et al.
2013).

Although the Gaia survey is not as deep as SDSS,
Pan-STARRS or DES, it is all sky at a spatial resolution
comparable to that of the Hubble Space Telescope (HST), and will
deliver high accuracy astrometry (positions and proper
motions) for all sources. The combination of these unique
features is what makes the comparatively shallow survey of
Gaia potentially powerful in the search for UFDGs. Here
we aim to exploit this in a technique to identify UFDGs.
The combination of positions and kinematics has proven
to be most efficient in the search for dark matter subhalos
in cosmological simulations (e.g. Behroozi, Wechsler & Wu
2013; Onions et al. 2012). But, to our knowledge, this is the
first time that both configuration space and kinematics are
included in the search for UFDGs. As we will show, Gaia
will enable us to probe parts of the UFDG parameter space
which have not been covered before, and will allow for a
comprehensive study of the spatial distribution around the
Milky Way (MW) of this faint galaxy population.

The present work continues the series 
(Brown, Velazquez & Aguilar 2005; Mateu et al. 2011), in 
which we have assumed the task of building ever more 
realistic Gaia mock catalogues, and used them to test 
tools that we have introduced to detect and characterize 
substructure in the stellar halo of our Galaxy. 

In Section 2 we introduce our Gaia mock catalogue,
which serves as our laboratory to study the detectability of
UFDGs. This includes a stellar background and our synthetic
UFDGs. The details of the Gaia selection function
and error model used to generate the Gaia observables are
described as well. In Section 3, we present our detection
tool, which consists of a peak identifier that is applied in the
sky and proper motion planes, a cross-matcher that identifies
peaks with common members in both planes, and a procedure
to evaluate the statistical significance of the matched
peaks. Section 4 presents our results. Detection limits are
shown as a function of astrophysical parameters and of
“effective parameters”, namely a combination of the former on
which our detection method depends directly. In Section 5
we summarize the limits of our method as well as the
assumptions that it is based on. Our conclusions are presented
in Section 6.


2 THE GAIA MOCK CATALOGUE 

Our Gaia mock catalogue is the stage where we assess the
success, efficiency and limits of our UFDG detection
technique. As such, it represents a controlled, but realistic
environment. There are several elements that compose it. First,
we need a model of the Galaxy, from which suitable stellar
backgrounds can be extracted (Section 2.1). We also
need a mass model for our synthetic UFDGs and a stellar
population model (Section 2.2). The latter is needed because our
UFDGs are not merely ensembles of particles: stellar
properties must be assigned to them, as they affect
the value and quality of their Gaia observables. The previous
elements allow us to assemble an extensive library
of UFDGs at various distances and with a wide range of
intrinsic parameters, projected against stellar backgrounds
at several Galactic latitudes. We then use a Gaia selection
function and error model to transform the theoretical
quantities into realistic Gaia observables, as our detection
method should work based on them only (Section 2.3). In
Section 2.4 we present the filtering method that we use to
eliminate foreground stars. Finally, we examine the nature of
the UFDG projections in the sky and proper motion planes,
as these are the basic input variables that our method works
on (Section 2.5).

2.1 The Galactic Background Model 

We use as a Galactic background the Gaia Universe Model
Snapshot (GUMS) from Robin et al. (2012), which is a simulated
catalogue of the sources expected to be observed by
Gaia, at a fixed epoch. It includes the simulation of Galactic
sources, Solar system and extragalactic objects.

We note here that Gaia will observe large numbers
(potentially millions) of galaxies and about half a million
QSOs (de Bruijne et al. 2015; de Souza et al. 2014), which
will all appear as faint point sources and could thus complicate
the search for UFDGs. However, discrete source classification
will be part of the published data (Bailer-Jones et al.
2013) and in this work we assume that we can rely on this
to filter out galaxies and QSOs (but see Sect. 5). Therefore,
we use only Galactic sources, and restrict the catalogue to a
range in latitude of 20° < |b| < 90°, to avoid the crowding
and high extinction expected near the Galactic plane.

The Galactic sources in GUMS are generated based on
the Besançon Galactic Model, which includes the Galactic
Thin and Thick Disks, Bulge and Halo, based on appropriate
density laws, kinematics, star formation histories, enrichment
laws, initial mass function (IMF) and total luminosities
for each of the component populations, described in detail
in Robin et al. (2012). Objects are simulated with masses
down to the hydrogen burning limit, corresponding to spectral
types down to ~ L5. Binary and multiple star systems
are also simulated (see details in Arenou 2011), introduced
with a probability that depends on the mass and evolutionary
state of the primary star. The probability distribution for the
separations is assumed to be a log-normal with the parameters
reported by Duquennoy & Mayor (1991) for (primary)
stars down to solar masses, and from Close et al. (2003) for
low mass stars.


2.2 The UFDG Model 

2.2.1 The Dynamical Model 

For our basic synthetic UFDG dynamical model we use a 
simple Plummer sphere with isotropic velocity distribution. 
A particular realization of this model is uniquely defined by 
two of the following parameters: 


• M_T: Total mass

• r_0: Core radius

• r_h: Half-mass radius

• σ_V: Velocity dispersion (3-D)


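The total gravitational binding energy of a Plummer
sphere is:

$$ W_P = -f\,\frac{G M_T^2}{r_0}, \quad \text{where } f = \frac{3\pi}{32}. $$

By virial equilibrium, we can establish a relation between
the total mass, core radius and velocity dispersion:

$$ 2K = -W \;\Rightarrow\; \sigma_V^2 = f\,\frac{G M_T}{r_0}. $$

In appropriate astronomical units, the previous relation is:

$$ \left(\frac{\sigma_V}{\mathrm{km\,s^{-1}}}\right) = 0.03559\,\sqrt{\frac{(M_T/M_\odot)}{(r_0/\mathrm{pc})}}. $$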

Observationally, the scale-length usually reported is
the half-light radius. Under the assumption of a
position-independent mass-to-light ratio (i.e. well mixed), the
half-mass radius and its light counterpart coincide. We assume
this and use r_h interchangeably as the half-light or the half-mass
radius. For a Plummer sphere, the relation between the core
and half-mass radius is: r_h = 1.30477 r_0. Then, the previous
relation between velocity dispersion, mass and radius can be
written as

$$ \left(\frac{\sigma_V}{\mathrm{km\,s^{-1}}}\right) = 0.04066\,\sqrt{\frac{(M_T/M_\odot)}{(r_h/\mathrm{pc})}} \qquad (1) $$


For a model with a given r_h and σ_V, the mass-to-light
ratio M/L of the UFDG is given by the ratio of its total mass
M_T, derived from Eq. 1, and the chosen total V-band stellar
luminosity L_V. The number of particles in the realization
(N_s) and the total stellar mass (M_s) are a consequence of the
assumed total luminosity, star formation history and stellar
mass function.
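
As a rough numerical check of Eq. 1, the following Python sketch rebuilds the coefficient from the Plummer binding-energy factor f = 3π/32 and the ratio r_h = 1.30477 r_0, and inverts the relation to obtain M_T and M/L. The value of G in pc (km/s)² per solar mass and all function names are assumptions introduced here for illustration; they are not part of the original work.

```python
import math

# Assumed value of G in units of pc (km/s)^2 / Msun (not quoted in the text).
G_PC_KMS2_MSUN = 4.30091e-3
F_PLUMMER = 3.0 * math.pi / 32.0   # binding-energy factor f of a Plummer sphere
RH_OVER_R0 = 1.30477               # half-mass radius over core radius


def sigma_v_kms(total_mass_msun, r_half_pc):
    """3-D velocity dispersion (km/s) of a virialized Plummer sphere (Eq. 1)."""
    r0_pc = r_half_pc / RH_OVER_R0
    return math.sqrt(F_PLUMMER * G_PC_KMS2_MSUN * total_mass_msun / r0_pc)


def total_mass_msun(sigma_kms, r_half_pc):
    """Total mass (Msun) obtained by inverting Eq. 1."""
    r0_pc = r_half_pc / RH_OVER_R0
    return sigma_kms ** 2 * r0_pc / (F_PLUMMER * G_PC_KMS2_MSUN)


def mass_to_light(sigma_kms, r_half_pc, l_v_lsun):
    """Mass-to-light ratio (Msun/Lsun) for a chosen V-band luminosity."""
    return total_mass_msun(sigma_kms, r_half_pc) / l_v_lsun


if __name__ == "__main__":
    # Coefficient of Eq. 1: dispersion for M_T = 1 Msun and r_h = 1 pc.
    print(f"coefficient = {sigma_v_kms(1.0, 1.0):.5f}")
    # Total mass implied by sigma_V = 10 km/s and r_h = 80 pc.
    print(f"M_T = {total_mass_msun(10.0, 80.0):.3e} Msun")
```

With these assumptions the coefficient agrees with the quoted 0.04066 to within rounding, which supports reading Eq. 1 as a square-root relation.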


2.2.2 The stellar population model 

We simulate the stellar population of the UFDGs as a 
single star formation burst with an age of 12 Gyr and 



Figure 1. Hess diagram M_G vs intrinsic V-I colour for the UFDG’s
stellar population model (see text for details). The colour scale is
proportional to the logarithmic number of stars in each bin. The right
y-axis indicates the maximum distance up to which a star with a
given M_G will be observable by Gaia, given the expected magnitude
limit of G_lim = 20 (assuming A_V = 0). For the gray bands see
discussion at the end of Section 2.3.


metallicity Z = 0.0001, assuming a Chabrier (2003) IMF.
We use the HB13 Stellar Population Synthesis code from
Hernández-Pérez & Bruzual (2013), which allows for a consistent
treatment of isolated and binary stars. The prescriptions
assumed in this code are similar, yet not identical, to
those used for the statistical orbital properties of binaries in
GUMS (see Section 2.1). In HB13, the binary probabilities
and orbital parameters are randomly drawn and assigned to
each primary star in the population at age zero and the evolution
is followed using the Hurley, Tout & Pols (2002) binary
evolution code. Binary probabilities are assumed to depend
on the mass of the primary using the prescription from Lada
(2006), and the distribution of periods, and thus separations,
follows Duquennoy & Mayor (1991). The resulting M_G versus
intrinsic V - I colour Hess diagram for the stellar population
used for the UFDGs is shown in Fig. 1.


2.2.3 Parameters of the simulated UFDGs 
Each simulated UFDG has 9 free parameters:

(i) intrinsic parameters:

• Total V-band luminosity: L_V

• Half-light radius: r_h

• Velocity dispersion: σ_V

(ii) extrinsic parameters:

• Heliocentric distance: D

• Position in the sky (l, b)

• Galactocentric velocity vector modulus: V_gal

• Azimuthal and latitudinal orientation angles of the
galactocentric velocity vector: φ_V, θ_V

We have generated a set of libraries with a total of
~30 000 UFDGs covering large ranges of the 9 parameters
Table 1. Ranges of parameters in the simulated UFDGs and parameters of the UFDG used as our fiducial case. See Sections 2.2.3
and 2.3 for definitions. N_obs and M/L can be obtained from the other parameters.

Parameter   Unit      Range                 Fiducial
L_V         L☉        86 to 7.5 × 10^?      5 × 10^3
r_h         pc        5 to 4000             80
σ_V         km s⁻¹    1 to 500              10
D           kpc       10 to 250             20
l           °         0 to 180              90
b           °         0 to 90               30
V_gal       km s⁻¹    24 to 550             453
φ_V         °         0 to 360              0
θ_V         °         -90 to 90             0
M/L         M☉/L☉     0.04 to 3.9 × 10^?    902
M_s         M☉        10 to 1.6 × 10^?      6 × 10^?
N_obs                 10 to 10^?            94



Figure 2. Left: σ_V vs. r_h. Right: M/L vs. L_V. The yellow dots correspond to members of our UFDG library. Note that this library contains
only systems with at least 10 stars observable by Gaia. Solid lines indicate constant total-mass models. Known UFDGs and classical dwarf
spheroidal galaxies are shown with green diamonds and blue squares respectively (data from McConnachie 2012). The labels correspond to:
Sgr (1), For (2), LeoI (3), Scl (4), LeoII (5), Sex (6), Car (7), UMi (8), Dra (9) for blue squares and CVnI (1), Her (2), Boo (3), UMa (4), LeoIV
(5), CVnII (6), UMaII (7), CmB (8), BooII (9), Wil1 (10), Seg1 (11), Seg2 (12), LeoV (13) for green diamonds. We do not include LeoT, which
would not be observable with Gaia, and PscII, which lacks measurements of some parameters.


(see Table 1). Our main library is generated with the following
parameters drawn at random: i) the number of stars
that would be observable by Gaia N_obs, ii) the heliocentric
distance D, iii) the apparent size of the UFDGs in the sky
θ and iv) in the proper motion plane Δμ, and v) the
center-of-mass velocity. These quantities are described in detail in
Sections 2.3 and 2.5. The first four parameters are generated
from a uniform distribution in a logarithmic scale. For the
last one, the angles φ_V and θ_V and the modulus V_gal are
generated following a uniform distribution, with V_gal between
zero and the local escape velocity for the Galaxy¹. The
remaining parameters (namely L_V, r_h and σ_V) are obtained
from the ones above. We also require N_obs to be at least 10
in this library. This library is designed with particular goals
described in detail in Sections 4.2 and 4.3.
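
The sampling scheme described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used to build the library: the sampling bounds for N_obs, the sky size and the proper motion size are placeholders, the function names are hypothetical, and the conversion from N_obs to L_V (which needs the stellar population model) is omitted. The intrinsic r_h and σ_V follow from inverting the apparent-size relations of Section 2.5, and the escape velocity uses the footnote formula with V_c = 200 km/s and a scale radius of 200 kpc.

```python
import numpy as np

V_C = 200.0   # km/s, circular velocity quoted in the footnote
R_T = 200.0   # kpc, scale radius quoted in the footnote


def escape_velocity(r_gal_kpc):
    """Escape velocity (km/s) at galactocentric radius r_gal_kpc."""
    r = np.asarray(r_gal_kpc, dtype=float)
    return V_C * np.sqrt(2.0 * (1.0 - np.log(r / R_T)))


def log_uniform(rng, low, high, size):
    """Draw samples uniformly distributed in log(x) between low and high."""
    return np.exp(rng.uniform(np.log(low), np.log(high), size))


def draw_library(n, seed=0):
    """Draw n sets of apparent parameters and derive the intrinsic ones."""
    rng = np.random.default_rng(seed)
    lib = {
        "n_obs": log_uniform(rng, 10.0, 1.0e3, n),   # placeholder upper bound
        "dist": log_uniform(rng, 10.0, 250.0, n),    # heliocentric distance, kpc
        "theta": log_uniform(rng, 1.0e-3, 23.0, n),  # sky size, deg (placeholder bounds)
        "dmu": log_uniform(rng, 0.1, 11.0, n),       # proper motion size, mas/yr (placeholder bounds)
    }
    # Invert Eq. 2 and Eq. 3 to recover the half-light radius and dispersion.
    lib["r_half"] = lib["theta"] * lib["dist"] / 0.0573   # pc
    lib["sigma_v"] = lib["dmu"] * lib["dist"] / 0.211     # km/s
    return lib


if __name__ == "__main__":
    library = draw_library(5)
    for key, values in library.items():
        print(key, np.round(values, 3))
    print("V_esc at 50 kpc:", float(escape_velocity(50.0)))
```

Drawing the apparent quantities rather than the intrinsic ones is what produces the sharp library boundaries in the apparent-parameter planes; a real implementation would also reject draws whose derived r_h or σ_V fall outside the ranges of Table 1.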

Fig. 2 illustrates the range explored in half-light radii
and velocity dispersion (left panel), as well as in
mass-to-light ratio and total V-band luminosity (right). The observed
values of these parameters for known UFDGs and classical
dSph galaxies are shown with green diamonds and blue
squares, respectively. Note that the range explored by our
synthetic library (yellow dots) is much larger than the observed
one for r_h, σ_V and M/L. In particular, the large range

¹ We compute the escape velocity as V_esc = V_c sqrt(2 (1 - ln(R_gal/r_t))),
with V_c = 200 km s⁻¹ and r_t = 200 kpc.


covered in σ_V results in a very large range of M/L (the
library spans an even larger range of M/L than shown in
Fig. 2). We are pushing the limits of the parameter space
explored, towards regions where the detection would be
observationally more difficult, i.e. towards larger r_h and σ_V (top
and right areas of left panel), and lower luminosity and high
M/L (top and left areas of right panel). The fact that N_obs is
generated uniformly, together with the large scatter in luminosity
for small N_obs due to stochastic effects, produces the
diffuse boundary in L_V in the right panel.


2.3 The Gaia selection function and error model 

Here we present our model for the Gaia observations, which
includes the selection function and the Gaia error model that
we apply to the GUMS model and the simulated UFDGs.
The Gaia observables are the 5 astrometric parameters (l, b,
ϖ, μ_l*, μ_b), the radial velocity, and the Gaia photometry
(including the G Gaia magnitude and the two broad band magnitudes
G_BP and G_RP). The final Gaia catalogue will also provide
three atmospheric parameters (metallicity, surface gravity
and effective temperature) and extinction. The true values
for these observables and parameters are obtained directly
from the models. The conversion from the Johnson-Cousins
photometric system to Gaia magnitudes is done following
the transformation given in table 3 from Jordi et al. (2010).
We do not consider extinction because all fields used in our
study are at relatively high latitudes (at least 30°).

The GUMS model and the simulated UFDGs include
binary and multiple systems. To determine which ones will
be resolved by Gaia, we use a prescription used within the
Data Processing and Analysis Consortium². In this model
the minimum angular separation on the sky that Gaia can
resolve depends on the apparent magnitudes of the stars in
the system, with the minimum separation being ~ 38 mas.
For the unresolved cases, a single detection is considered by
computing the total integrated magnitude, averaging positions
and taking the atmospheric parameters (such as surface
gravity) of the primary star in the system.
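
The treatment of unresolved systems can be illustrated with the short Python sketch below. It is only a schematic reading of the description above: the function name and the dictionary keys are hypothetical, the positions are combined with a plain unweighted mean (the text does not say whether any weighting is applied), and the first component in the list is assumed to be the primary.

```python
import math


def merge_unresolved(components):
    """Collapse an unresolved system into a single detection.

    components: list of dicts with keys 'mag', 'l', 'b' and 'logg';
    the first entry is assumed to be the primary star.
    """
    # Integrated magnitude: add the fluxes, then convert back to magnitudes.
    total_flux = sum(10.0 ** (-0.4 * c["mag"]) for c in components)
    n = len(components)
    return {
        "mag": -2.5 * math.log10(total_flux),
        "l": sum(c["l"] for c in components) / n,   # unweighted mean (assumption)
        "b": sum(c["b"] for c in components) / n,
        "logg": components[0]["logg"],              # taken from the primary
    }


if __name__ == "__main__":
    pair = [
        {"mag": 18.0, "l": 90.0, "b": 30.0, "logg": 2.5},
        {"mag": 18.0, "l": 90.00001, "b": 30.00001, "logg": 4.6},
    ]
    print(merge_unresolved(pair))
```

Two equal-brightness components come out 2.5 log10(2) ≈ 0.75 mag brighter than either star alone, which is why unresolved systems can cross the G = 20 limit even when their individual members would not.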

As an example, if we take a field³ of 2° × 2° centered
at l = 90° and b = 30°, there are initially 25 521 objects,
of which 57 per cent are single stars, 13 per cent are stars
of resolved multiple systems and 30 per cent are unresolved
systems⁴. For a simulated UFDG at 50 kpc, these fractions
are 63, 6 and 31 per cent, respectively.

To simulate Gaia-like errors for the GUMS catalogue
and the simulated UFDGs, we use the code presented in
Romero-Gomez et al. (2015), updated to the post-launch
performance⁵ as described in de Bruijne, Rygl & Antoja
(2015). Up to date information is available from the Gaia
web pages⁶. The uncertainties on the astrometry, photometry
and spectroscopy are mainly functions of the magnitude and
colour. The geometrical factors and the effect of the number
of passages due to the scanning law are also taken into
account⁷. For the surface gravity we take a constant error
of 0.25 dex, based on table 4 of Bailer-Jones et al. (2013).
Lacking a model of the Gaia performances for unresolved
systems, we use the same prescriptions as for single or resolved
stars. Only stars with magnitude G < 20 (the Gaia
magnitude limit) are considered.

Of all the Gaia astrometric observables, we cannot
make use of parallaxes to infer distances to UFDG stars.
The median relative error in parallax of the stars in the
UFDGs in the range of distances considered here is at least
70 per cent and on average 170 per cent, since they can
be very faint and distant objects. Besides, radial velocities
are not available for most of the cases, as 90 per cent of the
UFDGs in the range of distances explored here have at most
10 per cent of stars that are brighter than the magnitude limit
of the Gaia spectrograph (G_RVS = 16). Therefore, we use as


² http://www.cosmos.esa.int/web/gaia/dpac
(Mignard et al. 2008)

³ In what follows, we always work with 2° × 2° fields. To cover
the same solid angle, regardless of latitude, we have converted the
Galactic longitude of the stars l to l' = (l - l_0) cos(b_0) + l_0, where
l_0 and b_0 are the longitude and latitude of the center of the field,
respectively. For simplicity, we use l instead of l' hereafter.

⁴ After the cuts in parallax and surface gravity (see Section 2.4),
these fractions become 54, 5 and 41 per cent, respectively. The relative
increase of unresolved systems is because we are selecting large
distances and giant stars (dwarf stars have been removed), which
have higher binary fractions.

⁵ The code was released at the 2nd Gaia Challenge
Workshop and is publicly available at
https://github.com/mromerog/Gaia-errors
⁶ http://www.cosmos.esa.int/web/gaia/science-performance
⁷ http://www.cosmos.esa.int/web/gaia/table-6



Figure 3. Number of stars observed by Gaia N_obs, as a function of
the total V-band luminosity and distance of the UFDG.

our observables only the two angular positions in the sky (l
and b) and the two proper motions (μ_l* = μ_l cos(b) and μ_b).
The errors in the angular coordinates in the sky are of the
order of 0.05-0.4 mas whereas in proper motion these are
about 0.03-0.3 mas/yr.

The number of UFDG stars seen by Gaia N_obs (not
the same as the total number of stars in the realization N_s)
is determined by the total stellar luminosity L_V of the system
and the distance of the UFDG (given an assumed stellar
population model). On the right axis of Fig. 1 we indicate
the distance limit associated with the Gaia limiting magnitude
G = 20. Note that at distances larger than 25 kpc
only giant stars are observed. In Fig. 3 we show the number
of UFDG stars observed by Gaia as a function of luminosity
and distance. For instance, UFDGs of luminosity around
1000 L☉ have no stars bright enough to be observed by Gaia
beyond ~ 40 kpc but they will have around 15 observable
stars at around 23 kpc. The oscillations with distance present
at around 25, 60 and 120 kpc arise because the type of stars
of the UFDG population that Gaia can detect changes as it
is observed at different distances, depending on whether or
not features like the main sequence turn-off are observable.
These distances have been marked in the Hess diagram of
Fig. 1 and they correspond to the main sequence turn-off, the
extreme horizontal branch and the horizontal branch, respectively.
Note also the stochasticity around small numbers of
observable stars.
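
The distance limits on the right axis of Fig. 1 follow from the standard distance modulus. The Python sketch below shows the conversion in both directions; the function names are hypothetical, and the extinction argument defaults to zero, as assumed in this work.

```python
import math

G_LIM = 20.0  # Gaia magnitude limit adopted in the text


def max_distance_kpc(abs_mag_g, g_lim=G_LIM, extinction=0.0):
    """Largest distance (kpc) at which a star of absolute magnitude
    abs_mag_g is still brighter than the limit g_lim."""
    distance_modulus = g_lim - abs_mag_g - extinction
    return 10.0 ** (distance_modulus / 5.0 + 1.0) / 1000.0


def faintest_abs_mag(dist_kpc, g_lim=G_LIM, extinction=0.0):
    """Faintest absolute magnitude observable at distance dist_kpc."""
    return g_lim - extinction - 5.0 * math.log10(dist_kpc * 1000.0) + 5.0


if __name__ == "__main__":
    for dist in (25.0, 60.0, 120.0):
        print(f"D = {dist:5.1f} kpc -> M_G < {faintest_abs_mag(dist):.2f}")
```

Evaluating faintest_abs_mag at 25, 60 and 120 kpc gives the absolute magnitudes at which the features discussed above drop below the survey limit.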

The left panel of Fig. 4 shows the range covered by
our library in number of stars observable by Gaia N_obs and
distance D to the Sun. To include the known UFDGs and
classical dSphs in this plot we have computed N_obs assuming
the stellar population model described in Section 2.2.2, and
the total luminosity and distance reported by McConnachie
(2012) for these systems⁸. Here we can see that there are
real systems that go beyond the range covered by our library
towards small numbers of observed stars. We must remember
that N_obs is the number of stars that would be seen by Gaia,
which has a ~ 2 mag brighter limit than the SDSS⁹ used to

⁸ The SFH assumed in the stellar population model is reasonably
representative of the SFH of known UFDGs. For simplicity we assume
the same model for the classical dwarfs to get a rough estimate
of N_obs, although these have very different SFHs.

⁹ The SDSS survey is 2 mag deeper comparing the r and G bands,
or between 1 and 2 mag deeper comparing the g and G bands. This
Figure 4. Left: D vs. N_obs. Right: Apparent size Δμ in the proper motion plane vs. apparent size θ in the sky. The yellow dots show the range
spanned by the synthetic UFDG library. Known UFDGs and classical dwarf spheroidal galaxies are shown with green diamonds and blue
squares respectively (data from McConnachie 2012). The labels are as in Fig. 2.


identify those systems (e.g. Belokurov et al. 2007). This is a 
limitation imposed by Gaia that we cannot get around. Note 
also that the boundaries of the regions spanned by the library 
in this panel are sharp by construction (see Section 2.2.3). 


larger than that of the UFDGs. It is our experience that it is 
difficult to devise a unique algorithm that can identify our 
target systems at all distances, and so, limits are introduced 
as a necessary compromise. It is clear that specifically tailored
algorithms could be used for nearby cases.


2.4 Filtering the foreground 

Along a given line-of-sight (LOS), it is important to minimize
the number of background stars¹⁰ N_bg with respect
to the number of stars in the UFDG. We use a parallax cut
to filter out foreground disk stars, which have large parallaxes
with small errors. Thus, we discard data for stars with
ϖ - ε_ϖ > 0.1 mas, i.e. an observed parallax which, within
the errors, corresponds to distances smaller than 10 kpc. We
also filter out foreground disc dwarfs with the implementation
of a surface gravity log g cut: we discard stars with
log g - ε_logg > 4, where log g is the atmospheric parameter
derived from the Gaia observables. With these two cuts, we
reduce N_bg typically by an order of magnitude. For instance,
there were 25 521 stars in the GUMS model in our fiducial
field (l = 90° and b = 30°) and with the cuts we reduce this
number to N_bg = 1 413.
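
The two foreground cuts can be expressed as a simple boolean mask. The sketch below is a hedged illustration in Python with NumPy: the function and argument names are hypothetical, the thresholds are those quoted above, and the default log g error of 0.25 dex is the constant value adopted in Section 2.3.

```python
import numpy as np


def keep_mask(parallax, parallax_err, logg, logg_err=0.25,
              parallax_cut=0.1, logg_cut=4.0):
    """Return True for stars that survive both foreground cuts.

    parallax, parallax_err: observed parallax and its error, in mas.
    logg, logg_err: surface gravity and its error, in dex.
    """
    parallax = np.asarray(parallax, dtype=float)
    parallax_err = np.asarray(parallax_err, dtype=float)
    logg = np.asarray(logg, dtype=float)
    # Nearby star: parallax larger than the cut even after subtracting its error.
    nearby = (parallax - parallax_err) > parallax_cut
    # Dwarf star: surface gravity above the cut even after subtracting its error.
    dwarf = (logg - logg_err) > logg_cut
    return ~(nearby | dwarf)


if __name__ == "__main__":
    plx = [1.00, 0.05, 0.05, 0.30]
    err = [0.10, 0.20, 0.30, 0.30]
    grav = [2.0, 4.5, 2.0, 4.2]
    print(keep_mask(plx, err, grav))
```

Subtracting the error before comparing makes both cuts conservative: a star is removed only when it is a foreground object or a dwarf even at the low end of its uncertainty, which limits the loss of genuine UFDG members.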

We stress that these cuts have been designed to have
minimal loss of observable stars from the UFDGs, particularly
at relatively large distances (D > 10 kpc), for which
dwarf stars will not be observable by Gaia. The fraction of
lost stars is up to 70 per cent for nearby UFDGs at 10 kpc.
However, it goes down to 30 per cent at ~ 18 kpc and is less
than 10 per cent for distances larger than 25 kpc. Nevertheless,
with the cuts we are maximizing the relative fraction
of UFDG stars with respect to the background in all cases,
given that the fraction of stars lost from the background is

is estimated by taking the two extreme colours of the stars in our
simulated UFDGs, that is V - I = 0.25 and V - I = 1.5, and converting
these to G - r and G - g colours following Jordi et al. (2010).

¹⁰ The Galactic sources in GUMS are actually foreground and
background stars. We use hereafter “background” for simplicity.


2.5 The sky and proper motion planes 


The starting point of our detection procedure (Section 3) is
the projection of the UFDGs in the sky and proper motion planes;
thus it is essential to understand the behaviour of these
projections of the systems and the background.

Fig. 5 shows the stars in the customary 2° x 2° field 
of view of our fiducial simulated system in the sky (top) 
and proper motion planes (bottom). The parameters of this 
system are listed in Table 1. The stars belonging to the 
UFDG are coloured in green, while the background stars are 
in black. This system is hardly seen in the sky plane because 
it is very diffuse. But note how it is much more compact 
in the other plane. The compactness of the UFDGs in the
proper motion plane is a general characteristic of most of
our simulated UFDGs that considerably improves our search
(Section 4), and is a fundamental advantage of the Gaia data.

Note also the very different nature of the background
in these planes. In the sky the background is roughly constant,
but becomes markedly non-uniform in the proper motion
plane. This requires a special treatment when assigning
significance to the peaks (Section 3.3).

The apparent sizes of a UFDG in the sky and proper
motion planes are set by its intrinsic size and velocity dispersion,
combined with its distance from the Sun. These sizes
can span a wide range in both planes. The half-light angular
size is given by

$$ \theta_h\,(^\circ) \simeq 0.0573\, \frac{r_h\,(\mathrm{pc})}{D\,(\mathrm{kpc})} \qquad (2) $$

In our synthetic library, r_h varies between 5 and
4 000 pc (see Table 1) and D between 10 and 250 kpc (this is the
approximate distance limit to detect at least ~10 stars with Gaia,
Figure 5. Sky (top) and proper motion plane (bottom) for the field
of our fiducial UFDG in Table 1. The stars belonging to the system
are shown as green dots while the background stars are in black.


for a luminous UFDG with L_V ~ 7 × 10^? L☉). This implies a
range of apparent angular sizes of [4″, 23°].

In the proper motion plane, the apparent size is

$$ \Delta\mu\,(\mathrm{mas\,yr^{-1}}) \simeq 0.211\, \frac{\sigma_V\,(\mathrm{km\,s^{-1}})}{D\,(\mathrm{kpc})} \qquad (3) $$

For values of σ_V in the range [1, 500] km s⁻¹ and again D
between 10 and 250 kpc, we end up with a range of apparent
sizes of [0.0008, 11] mas/yr (but see below).
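
The coefficients of Eqs. 2 and 3 are unit conversions, and the quoted ranges can be reproduced with a few lines of Python. In this sketch the function names are hypothetical and the factor 4.74047 (km/s per mas/yr at 1 kpc) is the standard proper-motion conversion constant, assumed here because the text only quotes its inverse, 0.211.

```python
import math

KMS_PER_MASYR_KPC = 4.74047  # assumed standard conversion constant


def sky_size_deg(r_half_pc, dist_kpc):
    """Apparent half-light angular size in degrees (Eq. 2)."""
    return math.degrees(r_half_pc / (dist_kpc * 1000.0))


def pm_size_masyr(sigma_kms, dist_kpc):
    """Apparent size in the proper motion plane in mas/yr (Eq. 3)."""
    return sigma_kms / (KMS_PER_MASYR_KPC * dist_kpc)


if __name__ == "__main__":
    # Extremes of the library: smallest system far away, largest one nearby.
    print(f"sky: {sky_size_deg(5.0, 250.0) * 3600.0:.1f} arcsec "
          f"to {sky_size_deg(4000.0, 10.0):.1f} deg")
    print(f"pm:  {pm_size_masyr(1.0, 250.0):.4f} "
          f"to {pm_size_masyr(500.0, 10.0):.1f} mas/yr")
    # Fiducial system of Table 1: r_h = 80 pc, sigma_V = 10 km/s, D = 20 kpc.
    print(f"fiducial: {sky_size_deg(80.0, 20.0):.3f} deg, "
          f"{pm_size_masyr(10.0, 20.0):.3f} mas/yr")
```

For the fiducial system this gives roughly 0.23° on the sky but only about 0.1 mas/yr in proper motion, in line with the statement that the system is diffuse in the sky plane and compact in the proper motion plane.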

The right panel of Fig. 4 shows the sizes spanned in the 
sky and proper motion planes by the UFDGs in our library. 
As we will see in Section 3, these two parameters are the 
most important, together with the number of visible stars in 
the UFDG, in determining the detectability of the system. 
We can see here that our library extends well beyond the
spread covered by real systems. Note again that the boundaries
of the library are sharp in this panel, as it is generated
with apparent parameters drawn at random from a uniform
distribution in a logarithmic scale.

It is also important to note that the apparent size of a
UFDG in the proper motion space is greatly influenced by
the observational errors. To illustrate this, we use a set of
1 300 simulated UFDGs located at different distances and
with velocity dispersions between 15 and 25 km s⁻¹. Each
black dot in the top panel of Fig. 6 shows the error in μ_b
(similar for μ_l*) in each simulated UFDG computed as the


Figure 6. Top: median error in μ_b for ~ 1300 synthetic
UFDGs with velocity dispersions around 20 km s⁻¹ (15 < σ_V <
25 km s⁻¹) at different heliocentric distances. Bottom: dispersion
in the μ_b proper motion of the same set of synthetic
UFDGs (red dots). The black curve shows the median error in μ_b
calculated in logarithmic bins from the top panel. The error bars
correspond to the standard deviation. The blue line is the expected
size according to Eq. 3 for σ_V = 20 km s⁻¹.


median of the errors of all the individual stars in each UFDG. The
proper motion error slightly increases with distance, as one
would naively expect due to the fainter magnitudes. But the
error also oscillates with distance. This is because the Gaia
performances depend on the magnitude and colour of the
star, and the type of stars in the UFDG that Gaia can detect,
and thus the fraction of stars with certain magnitudes
and colours, changes as it is observed at different distances
(as seen in Section 2.3). One can see that the error has minima
around 30, 70 and 135 kpc. These are distances slightly
larger than the ones at which a sudden increase in the number
of stars of certain types occurs. They correspond to the
main sequence turn-off, the extreme horizontal branch and
the horizontal branch, as discussed previously (gray shaded
stripes in Fig. 1).

The bottom panel of Fig. 6 shows the real size in the
proper motion plane (red dots), computed as the standard
deviation of the proper motion coordinate $\mu_{\ell*}$
($\sigma_\mu = \sigma_{\mu_{\ell*}}$) of the stars in each
UFDG. We also overplot the error in $\mu_{\ell*}$ at each
distance (black curve), taken as the median error in
logarithmic bins from the top panel. The blue line in
this plot shows the expected size according to Eq. 3 for a
velocity dispersion of 20 km s$^{-1}$. We see that the sizes
of the UFDGs decrease up to ~ 40 kpc and for larger distances
they follow the oscillations due to the Gaia errors. Once
the size of the UFDGs is dominated by the observational
error, the apparent size oscillates between 0.1 and 0.2 mas/yr.
For smaller velocity dispersions, e.g. ~ 5 km s$^{-1}$, the
errors dominate already at a distance of 10 kpc. Therefore, the
minimum apparent size of the UFDGs is set by the observational
errors. For the range of parameters explored here, this
is above 0.1 mas/yr in 99.5 per cent of the cases. In what
follows we take $\sigma_\mu$ instead of $\Delta\mu$ as a better
measure of the apparent size of the UFDG in proper motion space.
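The interplay between intrinsic size and observational error can be sketched numerically. The snippet below is only an illustration: it assumes that the intrinsic proper motion dispersion follows the standard conversion $\sigma_V / (4.74047\,D)$ (with $D$ in kpc and the result in mas/yr), and that the measurement error adds in quadrature. Neither assumption is taken from Eq. 3 itself, and the error value of 0.15 mas/yr is a hypothetical number chosen within the 0.1-0.2 mas/yr range quoted above.

```python
import math

# Standard conversion: 1 mas/yr at 1 kpc corresponds to 4.74047 km/s.
K = 4.74047


def intrinsic_pm_size(sigma_v_kms, distance_kpc):
    """Proper motion dispersion (mas/yr) due to the velocity dispersion alone."""
    return sigma_v_kms / (K * distance_kpc)


def apparent_pm_size(sigma_v_kms, distance_kpc, pm_error_masyr):
    """Intrinsic size and measurement error combined in quadrature (assumption)."""
    return math.hypot(intrinsic_pm_size(sigma_v_kms, distance_kpc), pm_error_masyr)


# Hypothetical typical error of 0.15 mas/yr, for illustration only.
for distance in (10.0, 40.0, 100.0):
    print(distance,
          round(intrinsic_pm_size(20.0, distance), 3),
          round(apparent_pm_size(20.0, distance, 0.15), 3))
```

Under these assumptions a system with 20 km s$^{-1}$ at 40 kpc and one with 5 km s$^{-1}$ at 10 kpc have the same intrinsic size of about 0.1 mas/yr, which is comparable to the errors, in line with the distances at which the errors are said to take over.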

As seen in Section 2.3, the errors in the angular coordinates
in the sky are around 0.05-0.4 mas, which is negligible
compared to the apparent sizes of the UFDGs. For this reason,
we do not observe a similar effect in the sky plane.


3 THE DETECTION TOOLS 

In this section, we present all the different elements that 
compose our detection method to identify UFDG candidates 
against the background. 

Our strategy is as follows. We consider fields of view
of 2° × 2° in the sky. We first detect over-densities
independently in the sky and the corresponding proper motion
planes. For this we use the Wavelet Transform (WT, see
Section 3.1). We do this for over-densities of different
amplitudes and sizes, and keep the most significant ones.
After this, we perform what we call the cross-match of peaks
(Section 3.2). This consists in counting how many stars
belong simultaneously to a certain peak in the sky and to a
certain peak in proper motion space. We do this for all pairs
of peaks of any size between both spaces. For each cross-match,
we finally compute the probability that the observed
number of common stars is just a coincidence (Sections 3.3
to 3.5). Cross-matches with a low probability are selected
as possible UFDG candidates. Below we detail each of the
steps of our method.

One may wonder why we treat the sky and proper motion
planes separately. After all, what we are looking
for is a single peak in the 4-D space of positions in the
sky and proper motion planes. The reason is the very
different nature of these two planes, which makes it
impossible to define a natural metric in the combined space.
Any metric will imply the introduction of an arbitrary
dimensional scale which will limit the nature of the systems
found. This is why we have preferred to work on the sky
and proper motion planes separately, and then use the
cross-match procedure to relate peaks. The peaks that we do
identify correspond to single peaks in the combined 4-D space,
but not necessarily using a unique metric, as our combination
of different wavelet scales in both planes allows for a
larger range of identified peaks than if using a single metric.

Although the whole detection process might seem complex,
it is quite straightforward from the computational point
of view. The entire algorithm takes a total of 40 s to run for
our fiducial field of 2° × 2° on a single Intel(R) Core(TM)
i7-3770 CPU @ 3.40GHz. This might change depending on the
LOS but, as a first approximation, the celestial sphere above
b = 30° would require 86 h of CPU time, which in fact can
be spread over several CPUs for different LOS.

3.1 Wavelet analysis 

To detect over-densities in the sky and proper motion planes
like the ones of Fig. 5, we use the WT (Starck & Murtagh
2002). This can be thought of as a “localized” Fourier
transform that gives information about certain frequencies and
where in the image these frequencies are located. Due to the
wide range of apparent sizes of our simulated UFDGs (see
Section 2.5), our method needs to be able to detect
over-densities of different sizes. In the application here a
discrete set of frequencies (i.e. scales) are probed and we get
information about the localization of those particular structures.
We use here the à trous (“with holes”) variant of the WT
(Starck & Murtagh 2002) which computes a discrete set of
scale-related “views” of a 2-D function or image. We have
previously used this technique to detect moving groups in
the stellar velocity distribution of the Solar Neighborhood
and surroundings (Antoja et al. 2008, 2012). To perform the
calculations we use the MR software developed by CEA
(Saclay, France) and Nice Observatory.
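To make the idea of scale-related “views” concrete, the following is a minimal sketch of an à trous decomposition. It is not the MR software used in this work: it assumes the common B3-spline kernel (1, 4, 6, 4, 1)/16, mirror boundary conditions and a toy 16 × 16 count image, all of which are choices made here for illustration only.

```python
def smooth_1d(values, step):
    """B3-spline smoothing, kernel (1, 4, 6, 4, 1)/16 with taps `step` apart."""
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for offset, weight in ((-2, 1), (-1, 4), (0, 6), (1, 4), (2, 1)):
            j = i + offset * step
            while j < 0 or j >= n:  # mirror at the borders (needs n >= 2)
                j = -j if j < 0 else 2 * (n - 1) - j
            acc += weight * values[j]
        out.append(acc / 16.0)
    return out


def smooth_2d(image, step):
    """Apply the separable kernel along rows, then along columns."""
    rows = [smooth_1d(r, step) for r in image]
    cols = [smooth_1d(list(c), step) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def a_trous(image, n_scales):
    """Return (wavelet planes, smooth residual); their sum equals the input."""
    current = [[float(v) for v in row] for row in image]
    planes = []
    for j in range(n_scales):
        smoothed = smooth_2d(current, 2 ** j)  # holes double at each scale
        planes.append([[a - b for a, b in zip(ra, rb)]
                       for ra, rb in zip(current, smoothed)])
        current = smoothed
    return planes, current


# Toy image: flat background of 1 count per pixel plus one compact clump.
image = [[1.0] * 16 for _ in range(16)]
image[8][8] += 10.0
planes, residual = a_trous(image, 3)
```

Each wavelet plane is the difference between two successive smoothings, so a clump produces its strongest response in the plane whose scale is closest to its own size. How pixel scales map onto the angular and proper motion scales used in the text depends on the binning, which is not specified here.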

Although the WT works at a specific scale, it can identify
over-densities within some range in size. Nevertheless,
it is important to realize that we are probing a discrete set of
scales in the images, and therefore, it is critical to choose
those scales wisely. We explore 4 logarithmically spaced
scales in each plane within the ranges found in Section 2.5.
For the sky, as we are dealing with fields of 2° × 2°, we
have chosen the scales 0.05°, 0.1°, 0.2° and 0.4°. Even though
the largest scale puts a limit on the maximum size of an
UFDG that can be detected in principle, the innermost parts
of the more luminous UFDGs can still be detected, even if
they have larger angular sizes. For the proper motion plane,
we use scales of 0.12, 0.24, 0.48 and 0.96 mas/yr. Here,
what we are missing are exceptional cases with extremely
high velocity dispersion which are very close.

An example of the WT planes in the sky for our fiducial 
UFDG in Table 1 is shown at the top part of Fig. 7, while the 
bottom part shows the proper motion plane. In each case, 
the four scales mentioned are shown. The blue colours are 
proportional to the values of the WT. 

After the WT, we search for relative maxima to detect
the over-densities. The algorithm computes the Wavelet
Probability (WP), that is the probability that the detected
over-densities in the wavelet space are not due to Poisson
noise. For this it uses a model for this type of noise in
wavelet space. This is done by first using the Anscombe
transform (Anscombe 1948) that converts a signal with Poisson
noise into Gaussian noise, for which the treatment in the
WT planes is more straightforward (see Starck & Murtagh
2002 and references therein). Here we will consider only
WT peaks that have a probability of being real detections of
WP ≥ 99.7 per cent (green crosses in Fig. 7), 95.4 ≤ WP < 99.7
(orange crosses) and 68.2 ≤ WP < 95.4 (red crosses), similar
to > 3σ, 2σ-3σ and 1σ-2σ significance levels in the
Gaussian case, respectively. The size of the crosses in this
figure indicates the size or scale that is being probed in each
WT plane (also indicated in the top of the plots). There is an
additional condition on the over-densities: they should have
at least 5 stars to be considered a peak.
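The two ingredients of this step can be illustrated with a short sketch. The Anscombe transform is the standard $2\sqrt{x + 3/8}$, and the WP classes are mapped to colours exactly as described above. The function names are hypothetical and the noise model applied by the MR software inside the wavelet planes is not reproduced here.

```python
import math


def anscombe(counts):
    """Anscombe (1948) transform: Poisson counts -> roughly unit-variance Gaussian."""
    return 2.0 * math.sqrt(counts + 3.0 / 8.0)


def gaussian_confidence(k_sigma):
    """Two-sided probability enclosed within +/- k sigma of a Gaussian."""
    return math.erf(k_sigma / math.sqrt(2.0))


def wp_class(wp_percent):
    """Colour class of a wavelet peak given its WP in per cent (None if rejected)."""
    if wp_percent >= 99.7:
        return "green"   # similar to > 3 sigma
    if wp_percent >= 95.4:
        return "orange"  # similar to 2-3 sigma
    if wp_percent >= 68.2:
        return "red"     # similar to 1-2 sigma
    return None


for k in (1, 2, 3):
    print(k, round(100.0 * gaussian_confidence(k), 2))
```

The printed Gaussian percentages are close to the 68.2, 95.4 and 99.7 per cent limits adopted for the WP classes.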

The UFDGs are optimally detected (i.e. with higher
WP) when the scales probed are similar to their apparent
sizes. In the example of Fig. 7 a black circle indicates the
position and extension of the UFDG in the sky plane and in
the proper motion plane. In the sky images, for scales that
are smaller than the apparent size of the UFDG (two first
panels), some peaks are detected inside the region occupied
by the UFDG but with low WP (1σ or 2σ, red and orange
crosses). For larger scales, which in this case are similar to
the apparent size of the UFDG, the detection is above 3σ




[Figure 7 panels. Top row (sky plane), WT scales: 0.05, 0.10, 0.20 and 0.40 deg. Bottom row (proper motion plane), WT scales: 0.12, 0.24, 0.48 and 0.96 mas/yr; axes in mas/yr.]

Figure 7. Wavelet Transform (WT) at different scales for our fiducial UFDG for the sky (top) and proper motion planes (bottom). The black
dashed circle shows the true position and size of the system in each plane. The circle in the proper motion plane is very small but can be
seen better in the smaller scales. The position is calculated as the median of the coordinates (positions and proper motions) of the stars in the
UFDG that are observed by Gaia, while the size of the circle is taken as the larger of the standard deviations of the two coordinates. Red,
orange and green crosses indicate peaks at between 1σ and 2σ, between 2σ and 3σ, and > 3σ significance, respectively.


(green crosses). In other cases, the detection is always
below 3σ or below 2σ because the UFDGs can be very diffuse
in this plane, as already highlighted in Section 2.5.

On the other hand, the UFDG is very compact in proper
motion space, and it stands out as an over-density for all
scales studied in the left part of the panels. In our example,
the fiducial UFDG is always detected above 3σ (green
crosses). In other cases, the best detection is for a particular
scale that is close to the apparent size of the UFDG.

Note also how, in both planes, a number of low-WP
random detections appear (most of the red crosses in Fig. 7).
Because of this, we need to discard false over-densities and
keep only good UFDG candidates (Section 3.2). Besides, in
the proper motion case, two over-densities with high WP are
also detected in the center of the distribution for the two
largest scales. These correspond to the peaks of the background
distribution, which, as we have seen in Fig. 5, is not
uniform. Note that the distribution of background stars in
the proper motion plane will be different for each LOS and
therefore, its centroid will shift to different positions in this
plane. This does not occur for the sky plane, which presents
a uniform background.

3.2 Cross-matching peaks in the two planes 

So far, we have detected peaks in the sky and proper motion
planes, separately. However, contrary to false detections, an
UFDG is an over-density in the 4-D combined space
$l$-$b$-$\mu_{\ell*}$-$\mu_b$. This is precisely the feature
that we need to exploit to identify UFDGs, beyond what has
been currently achieved.


To do this, we list the stars contributing to the peaks 
identified separately in the sky and proper motion planes. 
By stars belonging to a certain peak, we mean those that are 
enclosed in a circle around the peak with the radius of the 
WT scale in the considered plane. In practice, because Gaia 
is a point source catalogue, we can identify the stars by their 
id number. Then we see whether a large fraction of these 
stars belong simultaneously to a certain peak in the sky and 
a peak in the proper motion plane. We call this “cross-match 
of peaks”. This cross-match is done for every peak and at 
every scale in the sky, compared to every peak at every scale 
in the proper motion plane. 
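The cross-match described above can be sketched as a set intersection on star identifiers. The data layout (dictionaries with keys `id`, `l`, `b`, `pm_l`, `pm_b`, and peaks with `center` and `scale`) is hypothetical, and distances in the sky are computed in a flat approximation that ignores the cos b factor, which is an assumption made here for brevity in a small field.

```python
import math


def stars_in_peak(stars, center, radius, keys):
    """IDs of the stars inside a circle of the given radius around a peak."""
    cx, cy = center
    kx, ky = keys
    return {s["id"] for s in stars
            if math.hypot(s[kx] - cx, s[ky] - cy) <= radius}


def cross_match(stars, sky_peaks, pm_peaks):
    """Common stars for every (sky peak, proper motion peak) pair at every scale."""
    matches = []
    for sky in sky_peaks:
        in_sky = stars_in_peak(stars, sky["center"], sky["scale"], ("l", "b"))
        for pm in pm_peaks:
            in_pm = stars_in_peak(stars, pm["center"], pm["scale"], ("pm_l", "pm_b"))
            matches.append((sky, pm, in_sky & in_pm))
    return matches


# Toy catalogue: stars 1 and 2 are clumped in both planes.
stars = [
    {"id": 1, "l": 0.00, "b": 0.00, "pm_l": 1.00, "pm_b": 1.00},
    {"id": 2, "l": 0.02, "b": 0.01, "pm_l": 1.05, "pm_b": 0.95},
    {"id": 3, "l": 0.03, "b": -0.02, "pm_l": -3.00, "pm_b": 0.00},
    {"id": 4, "l": 0.90, "b": 0.90, "pm_l": 1.00, "pm_b": 1.00},
]
sky_peaks = [{"center": (0.0, 0.0), "scale": 0.05}]
pm_peaks = [{"center": (1.0, 1.0), "scale": 0.12}]
matches = cross_match(stars, sky_peaks, pm_peaks)
```

In this toy case star 3 is close to the peak in the sky only and star 4 in proper motion only, so neither counts as a common star.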

The probability of having this cluster of common stars
occurring by chance is computed as described in
Section 3.3. In Section 3.4 we explain how we
filter out false detections. Because each UFDG can be
detected in more than one scale, we also need to keep only
independent detections. This is explained in Section 3.5.


3.3 Assessing the probability of the detections 

Here we describe the statistical machinery that we devised to
assess the probability of detection, i.e. compute which
detections have a very low probability of occurring by chance.

We are interested in $P(N_{\rm com} \mid \langle N_{\rm com}\rangle)$,
i.e. the probability of observing a certain number of common
stars $N_{\rm com}$ in a peak in the sky and a peak in the proper
motion plane, given the expected number of common stars
$\langle N_{\rm com}\rangle$. This probability is simply given by
the Poisson probability distribution function

$$P = P(N_{\rm com} \mid \langle N_{\rm com}\rangle) = {\rm Poisson}(N_{\rm com} \mid \langle N_{\rm com}\rangle). \qquad (4)$$

An estimate of $\langle N_{\rm com}\rangle$ is given by:

$$\langle N_{\rm com}\rangle = \langle N_{\rm sky}\rangle \int_{A_\mu} \rho(\mu_{\ell*}, \mu_b)\, d\mu_{\ell*}\, d\mu_b, \qquad (5)$$

where $\langle N_{\rm sky}\rangle$ is the expected number of stars
in the l-b peak and $\rho(\mu_{\ell*}, \mu_b)$ describes the
(normalized) number density of stars in the proper motion plane,
both under the assumption that no UFDG is present. $A_\mu$
indicates the area of the peak over which we are integrating,
which is a circle with a radius given by the WT scale, centered
on the coordinates of the peak in question. For convenience, we
use hereafter the logarithm of the probability, ln P.

For simplicity, we assume that the background density
in the l-b plane is uniform, which is reasonable for the field
size used, and therefore
$\langle N_{\rm sky}\rangle = N_{\rm BG}\,(\pi r_{\rm sky}^2)/A_T$,
where $A_T$ is the total area of the field in the sky plane (in
our case 4 deg$^2$), $r_{\rm sky}$ is the wavelet scale in the
plane of the sky and $N_{\rm BG}$ is the number of background
stars in the field. The latter is computed from the observed
data itself, by taking the 8 fields adjacent to our problem
field, with the same total area. For each of these fields we
compute the total number of stars and we take the median. This
is a better estimation of the number of background stars than
the total number of stars in the considered field, especially
in cases of luminous UFDGs that have a number of observed stars
that is not negligible compared to the number of background stars.

In the proper motion plane, however, it is crucial to account
for the fact that the density is not constant and this
is achieved by the integral term in Eq. 5, which, multiplied
by $\langle N_{\rm sky}\rangle$, gives the number of common stars
expected to lie within the area of the detected peak.
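As an illustration of Eqs. 4 and 5, the sketch below evaluates ln P for a given cross-match. It assumes that Eq. 4 is the Poisson probability mass evaluated at the observed number of common stars, and it takes the value of the proper motion integral as an input (`pm_fraction`). The function names and the numbers in the example are hypothetical.

```python
import math


def expected_sky_count(n_bg, r_sky_deg, field_area_deg2=4.0):
    """<N_sky>: background stars expected in a circle of radius r_sky."""
    return n_bg * math.pi * r_sky_deg ** 2 / field_area_deg2


def expected_common(n_sky, pm_fraction):
    """<N_com> (Eq. 5): <N_sky> times the integral of the normalised
    proper motion density over the peak area (passed in as pm_fraction)."""
    return n_sky * pm_fraction


def ln_poisson(n_com, expected):
    """ln of the Poisson probability of n_com given the expected value (Eq. 4)."""
    return n_com * math.log(expected) - expected - math.lgamma(n_com + 1)


# Hypothetical numbers: 4000 background stars in the 2 x 2 deg field,
# a sky peak at the 0.1 deg scale, and 1 per cent of the background
# proper motions falling inside the proper motion peak.
n_sky = expected_sky_count(4000, 0.1)
n_expected = expected_common(n_sky, 0.01)
print(n_sky, n_expected, ln_poisson(15, n_expected))
```

Working with the logarithm avoids numerical underflow, since the probabilities of interest are extremely small.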

The distribution of stars in the proper motion plane
$\rho(\mu_{\ell*}, \mu_b)$ is different depending upon the
direction on the sky, so it must be computed independently for
each field. We do this from the observed data itself, by taking
the mentioned 8 adjacent fields with the same total area. For
each of these fields we compute the density as a (normalised)
2-D histogram in the $\mu_{\ell*}$-$\mu_b$ plane with a pixel
size of 0.8 mas/yr. Taking these eight 2-D histograms, we compute
the pixel-by-pixel median density to obtain a statistically
reliable estimate in each of the matrix cells, minimizing the
effect of outliers. With this, we are assuming that the
distribution of background proper motions remains similar among
these adjacent fields. This is indeed the case. For instance,
variations in the median proper motion in longitude and
latitude among the adjacent fields are in general smaller than
the pixel size. We numerically evaluate the integral in Eq. 5
using the trapezoid rule in 2-D and bi-linear interpolation on
the median density matrix.
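The construction of the median background density can be sketched as follows. This is a simplified illustration under stated assumptions: the grid limits are hypothetical, stars outside the grid are ignored, and the trapezoid integration with bi-linear interpolation described above is not included.

```python
from statistics import median


def normalised_hist2d(points, lo, hi, pixel):
    """Normalised 2-D histogram (density per unit area) on a square grid."""
    n = int(round((hi - lo) / pixel))
    grid = [[0.0] * n for _ in range(n)]
    kept = 0
    for x, y in points:
        i, j = int((x - lo) // pixel), int((y - lo) // pixel)
        if 0 <= i < n and 0 <= j < n:
            grid[i][j] += 1.0
            kept += 1
    if kept == 0:
        return grid
    norm = kept * pixel * pixel
    return [[v / norm for v in row] for row in grid]


def median_density(histograms):
    """Pixel-by-pixel median over the histograms of the adjacent fields."""
    n = len(histograms[0])
    return [[median(h[i][j] for h in histograms) for j in range(n)]
            for i in range(n)]


# Toy example with three "adjacent fields" on a 2 x 2 grid of 0.8 mas/yr pixels.
fields = [
    [(0.1, 0.1), (0.1, 0.2), (1.0, 1.0)],
    [(0.2, 0.1), (0.3, 0.3), (1.1, 0.9)],
    [(0.1, 0.1), (1.0, 1.0), (1.2, 1.2)],  # slightly different field
]
hists = [normalised_hist2d(f, 0.0, 1.6, 0.8) for f in fields]
rho = median_density(hists)
```

Taking the median rather than the mean means that a single discrepant field, for instance one that contains an UFDG, has little influence on the estimated density.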

It could happen that one or several of the adjacent fields
contain UFDGs. This would yield a wrong estimation of
$N_{\rm BG}$ and $\rho(\mu_{\ell*}, \mu_b)$. The fact that we
use the median of the 8 fields helps to alleviate this issue.
However, in case very

Footnote: Do not confuse this probability for the combined sky
and proper motion planes, P, with the wavelet probability WP
used in Section 3.1.


luminous UFDGs are present, our algorithm checks if the
number of stars in one of the adjacent fields is significantly
larger than in the others. This is done by checking that the
dispersion in the number of stars in the 8 fields is not larger
than 2.5 times the square root of the median. If it is larger, the
algorithm could be re-run without the field in question.
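This check can be written in a few lines. The sketch below assumes that the dispersion is the population standard deviation of the 8 star counts; the text does not specify which estimator is used, and the function name and the counts are hypothetical.

```python
from statistics import median, pstdev


def neighbours_look_clean(counts, factor=2.5):
    """True if the scatter of the star counts in the adjacent fields is
    compatible with Poisson noise, i.e. not above factor * sqrt(median)."""
    return pstdev(counts) <= factor * median(counts) ** 0.5


quiet = [1000, 1010, 990, 1005, 995, 1002, 998, 1001]
contaminated = [1000, 1010, 990, 1005, 995, 1002, 998, 1500]
print(neighbours_look_clean(quiet), neighbours_look_clean(contaminated))
```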

Instead of using the probability of Eq. 4, we can also
use the significance s, defined as the number of times above
the expected value of the distribution, scaled to the dispersion
of the distribution

$$s = \frac{N_{\rm com} - \langle N_{\rm com}\rangle}{\sqrt{\langle N_{\rm com}\rangle}}. \qquad (6)$$

The advantage of using s instead of ln P is that s is positive
and it increases for more relevant detections.
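Eq. 6 translates directly into code. The function name is hypothetical and the numbers are chosen only to make the arithmetic easy to follow.

```python
import math


def significance(n_com, expected):
    """Significance s of Eq. 6: excess over the expectation in units of
    the Poisson dispersion sqrt(<N_com>)."""
    return (n_com - expected) / math.sqrt(expected)


# 25 common stars where 4 are expected from the background alone.
print(significance(25, 4.0))
```

Note that s is positive only for over-densities, that is when the observed number of common stars exceeds the expected one.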

3.4 Setting a threshold probability for detection 

We are only interested in those detections which have very
low probability (very negative ln P) and also a number of
common stars $N_{\rm com} > \langle N_{\rm com}\rangle$. However,
the central peak(s) of the background in the proper motion plane
can also appear as a detection. This is because we estimate its
expected number of stars using the adjacent fields and any small
fluctuation above this value can give a significant detection,
though with a larger ln P. To filter peaks that correspond to the
background and not to the UFDGs, we can conservatively select a
relatively low (very negative) threshold value for ln P, below
which we consider detections to be relevant. However, we
must realize that, as we lower the threshold value
$\ln P_{\rm thres}$, although we eliminate spurious peaks, we
start losing relevant detections, so this is a compromise
between false positives and losing bona fide peaks.

We have explored this compromise on various LOS’s.
In Table 2 we list the percentage of recovered UFDGs
(%$_{\rm rec}$) and of false detections (%$_{\rm false}$) with
respect to the total number of tested UFDGs, for 5 different
values of the threshold and 10 different LOS’s. For values above
-9.0 there are several fields where the percentage of false
detections is above 20 per cent. For a threshold of -12.0, the
percentage of false detections is at most 1.1 per cent in all
fields. Although an intermediate value of $\ln P_{\rm thres}$
might be closer to optimal, we conservatively choose
$\ln P_{\rm thres} = -12$.

3.5 Independent Detections 

As explained before, the cross-match of peaks is done for 
all peaks at all scales and a given UFDG can be detected 
in more than one scale. This means that we need to select 
which of the many detections made in a given field, are in 
fact independent detections, i.e. different objects. 

We first organize all detections by increasing ln P,
choose the detection with the lowest ln P and compare its
l-b and $\mu_{\ell*}$-$\mu_b$ coordinates with the remaining
detections. Now, we choose the next independent detection as the
one with the lowest ln P that lies, both in the sky and proper
motion planes, at a distance larger than the sum of the WT

Footnote: The last condition is required in order to select only
over-densities but not under-densities.
Table 2. Detections, real and false, as percentage of the number of UFDGs used in each LOS.

Figure 8. ln P as a function of the number of common stars
$N_{\rm com}$ for our fiducial field and UFDG. The color scale is
proportional to $f_{\rm UF}$, i.e. the fraction of recovered stars
from each UFDG. The dashed horizontal line marks the threshold
below which we consider detections as relevant. The detection with
the lowest ln P (marked with a cross) is what we take as the best
independent detection (see text for details).

scales of the two detections (i.e. they do not overlap). We
repeat this procedure until we have gone through all the
available (relevant) detections.
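The selection of independent detections can be sketched as a greedy procedure. The code below follows a literal reading of the description: a detection is accepted only if, in both planes, it lies farther than the sum of the WT scales from every detection already accepted. Comparing against all previously accepted detections, the data layout and the toy values are assumptions made for this illustration.

```python
import math


def separated(a, b, plane):
    """True if two detections do not overlap in the given plane."""
    (ax, ay), (bx, by) = a[plane]["center"], b[plane]["center"]
    return math.hypot(ax - bx, ay - by) > a[plane]["scale"] + b[plane]["scale"]


def independent_detections(detections, ln_p_threshold=-12.0):
    """Greedy selection by increasing ln P among the relevant detections."""
    relevant = sorted(
        (d for d in detections
         if d["ln_p"] < ln_p_threshold and d["n_com"] > d["expected"]),
        key=lambda d: d["ln_p"])
    chosen = []
    for d in relevant:
        if all(separated(d, c, "sky") and separated(d, c, "pm") for c in chosen):
            chosen.append(d)
    return chosen


detections = [
    {"ln_p": -50.0, "n_com": 30, "expected": 2.0,
     "sky": {"center": (0.00, 0.0), "scale": 0.10},
     "pm": {"center": (1.0, 1.0), "scale": 0.12}},
    {"ln_p": -40.0, "n_com": 35, "expected": 4.0,   # same object, larger scales
     "sky": {"center": (0.01, 0.0), "scale": 0.20},
     "pm": {"center": (1.0, 1.0), "scale": 0.24}},
    {"ln_p": -20.0, "n_com": 12, "expected": 1.0,   # a different object
     "sky": {"center": (0.80, 0.8), "scale": 0.05},
     "pm": {"center": (-2.0, 0.5), "scale": 0.12}},
    {"ln_p": -5.0, "n_com": 6, "expected": 3.0,     # above the threshold
     "sky": {"center": (-0.5, 0.5), "scale": 0.05},
     "pm": {"center": (0.0, 0.0), "scale": 0.96}},
]
best = independent_detections(detections)
```

In this toy case the second detection is the same object seen at larger scales and is discarded, while the fourth never enters the list because its ln P is above the threshold.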

To illustrate the behaviour of ln P and the selection of
independent detections, in Fig. 8 we plot, for our fiducial
UFDG, ln P as a function of the number of common stars
$N_{\rm com}$ for all detections in this field, that is the
results of cross-matching all peaks at all scales in the sky and
proper motion plane. In the plot we use a color scale
proportional to $f_{\rm UF}$, defined as the number of stars from
those $N_{\rm com}$ that truly belong to the UFDG divided by the
total number of stars originally in the UFDG. In other words,
$f_{\rm UF}$ is the fraction of recovered stars from each UFDG.
Dots below the horizontal dashed line are relevant detections,
i.e. with ln P below $\ln P_{\rm thres}$ (Section 3.4).

There is a correlation of ln P with $N_{\rm com}$. As expected,
detections with larger numbers of common stars have on
average lower ln P. There is also a sequence that moves across
the plot above the threshold. This corresponds to peaks in the
proper motion background (note that they are black points,
i.e. with no stars belonging to the UFDG). As explained
before, these detections are filtered by our threshold.


Also, detections with the largest values of $f_{\rm UF}$ have
low ln P, i.e. they are significant detections. However, as the
number of stars in common increases, the value of ln P
decreases, reaches a minimum and then increases again. The
minimum value occurs for detections at the optimum scales
in the sky and proper motion planes. It is in this case that
a large fraction of UFDG stars lie in the detected peak
inside the WT scale, and the background is sufficiently low,
so that the difference between the observed and expected
number of common stars in the peak is maximal. Increasing
the WT scale past the optimum values causes the inclusion
of more stars of the UFDG in the peak but also more stars
of the background that might not necessarily belong to both
peaks in the sky and proper motion plane simultaneously,
and hence, this causes ln P to go back to larger values. This
is a very convenient behaviour which allows us to select
detections at the optimum WT scales. The detection with the
lowest ln P (marked with a cross) is what we take as the best
(and in this case, only) independent detection. Finally,
notice that in this example no false positives are picked up.


4 RESULTS 

As we have seen in Section 2.2.3, we face a 9-D parameter
space. Even with our library of more than 30000 different
synthetic UFDGs, it is clear that we can cover only a limited
amount of this vast hypervolume.

To explore this space with some order, we will rely
first on a series of carefully curated ensembles of cases. In
each one, all parameters, except two, will be kept fixed
(Section 4.1). This allows us to take 2-D sections of the
original parameter space. Then we will identify in Section 4.2 a
reduced number of combinations of the original parameters
that our detection procedure depends on directly, and which
we call “effective parameters”. In Section 4.3, we explore the
limits and completeness of our method in the space of
effective parameters, as well as in some of the most relevant
original parameters. The effect introduced by changing the
background level as we look at different LOS’s is discussed
in Section 4.4.
Table 3. Values for the fixed parameters in the ensembles of UFDGs shown in Fig. 9.

4.1 The physical parameter space

In the ensembles of tests presented here, we vary only 2
parameters, keeping the other 7 parameters constant. The
values for the fixed parameters are shown in Table 3.


4.1.1 The $r_h$ vs $\sigma_V$ plane

In the first test we use 625 synthetic UFDGs with varying $r_h$
and $\sigma_V$. As indicated in Table 3, all UFDGs are located at
a fixed position in the sky at l = 90°, b = 30°, at a heliocentric
distance of 20 kpc and have a luminosity of
$L_V = 5 \times 10^{[\ldots]}\,L_\odot$.
These are approximately the mean values for the observed
UFDGs. The velocity dispersion varies logarithmically
between 3 and 100 km s$^{-1}$ and the half-light radius between
5 pc and 4 kpc. Due to stochastic variations in each realization,
despite the luminosity and distance being constant, the
number of detectable stars $N_{\rm obs}$ varies between 38 and 207.
The results of the test are shown in the top panel of Fig. 9.
Each symbol (squares and crosses) in this plot corresponds
to one simulated UFDG. The color scale in the panels is
proportional to the detection significance (Eq. 6). Black crosses
indicate UFDGs that were not detected. From this plot we
can evaluate the detection limits as a function of $r_h$ and
$\sigma_V$.

UFDGs with $r_h$ larger than ~ 600 pc are not detected
(for this fixed distance and luminosity). This is because their
apparent size in the sky is very big, making them extremely
diffuse. In fact, $r_h = 700$ pc results in an angular size equal
to the sky fields that we are using for our analysis (2° × 2°).
We also notice that for velocity dispersions below 20 km s$^{-1}$
the detection significance depends mainly on the half-light
radius (vertical contours). This is because in this regime the
apparent size of the UFDG in proper motion is in fact
constant and set by the observational errors (Section 2.5). Above
this velocity dispersion, the contours bend slightly to the
left, meaning that for a given size in the sky, the detection
is more significant for lower velocity dispersions. Note that
we are exploring velocity dispersions up to ~ 100 km s$^{-1}$,
i.e. significantly larger than the typical velocity dispersion
of $\sigma_V$ ~ 5 km s$^{-1}$ of known UFDG and classical dSph
galaxies (McConnachie 2012).
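The correspondence between 700 pc and the field size can be checked with a one-line conversion. The sketch assumes that the apparent size is simply the angle subtended by the half-light radius at the heliocentric distance; the exact form of Eq. 2 is not reproduced in this section, so this is an assumption.

```python
import math


def angular_size_deg(r_h_pc, distance_kpc):
    """Angle (degrees) subtended by a radius r_h at distance D (assumed form)."""
    return math.degrees(math.atan(r_h_pc / (distance_kpc * 1000.0)))


# Half-light radii spanning the explored range, at the fixed distance of 20 kpc.
for r_h in (5.0, 600.0, 700.0, 4000.0):
    print(r_h, round(angular_size_deg(r_h, 20.0), 3))
```

With this definition, 700 pc at 20 kpc subtends very nearly 2°, the side of the fields used in the analysis.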


4.1.2 The $\sigma_V$ vs $V_{\rm gal}$ plane

The middle panel of Fig. 9 is a test with 625 UFDGs, where
the velocity dispersion $\sigma_V$ and the modulus of the
velocity vector $V_{\rm gal}$ are varied. In this case,
$V_{\rm gal}$ is varied linearly instead of logarithmically.
Notice that in this ensemble, we only change the position and
spread of the UFDG peak in the proper motion plane. In
particular, we have chosen the values for the fixed velocity
angles ($\theta_V$ and $\phi_V$), so that the UFDG peak position
moves horizontally across the proper motion plane as we vary
$V_{\rm gal}$, covering all possible contrasts between background
and UFDG, and coinciding with the background peak for
$V_{\rm gal}$ ~ 150 km s$^{-1}$.

Note how, for a fixed value of $V_{\rm gal}$, the best detections
are the ones for lower velocity dispersions, which produce
more concentrated peaks. Besides, something that immediately
stands out from this plot, compared with the others
shown in Fig. 9, is the shallow variation in the detection
significance across the entire plane. In the horizontal
region around $V_{\rm gal}$ ~ 150 km s$^{-1}$, we see that the
significance is the lowest, as we expected, but this is a very
subtle effect. This lack of sensitivity indicates that, although
they play a role, these two parameters (and especially
$V_{\rm gal}$) have little effect on the detectability of the UFDG.

4.1.3 The $L_V$ vs D plane

The bottom panel of Fig. 9 shows the results of the experiment
where luminosity and distance were varied between
$3 \times 10^{[\ldots]}$ and $5 \times 10^{[\ldots]}\,L_\odot$, and
between 10 and 250 kpc, respectively. There are 509 UFDGs in this
test. Their mass-to-light ratio M/L is between 20 and
$4 \times 10^{[\ldots]}$. Their observable number of stars
$N_{\rm obs}$ varies between 2 and 6000. Note that here
we consider a lower value for the minimum $N_{\rm obs}$ than the
10 indicated in Table 1, to sample in detail the detection limit.
We find, however, that the minimum $N_{\rm obs}$ that
gives a positive detection is 5 for this particular example.

In this test, there are several competing effects. For a
fixed luminosity, as we increase the distance, the size in the
sky and proper motion planes decreases, which favours
identification, but on the other hand, the number of visible stars
also decreases, which makes identification harder. The first
effect scales as $\propto 1/D$, while the second, being an
individual star luminosity problem, scales as $\propto 1/D^2$.
So, at large distances the latter dominates and we lose the
UFDGs, as seen in this panel. For instance, UFDGs with
luminosities around $10^{[\ldots]}\,L_\odot$ are not detected
beyond ~ 100 kpc. Also, given a fixed distance, more luminous
objects are detected with higher significance.

There is an interesting modulation in the colour contours
in this panel. There are three leftward indentations of
better detections at around 10, 25 and 60 kpc, which are better
seen in the red and light-blue colours. These features are
not statistical fluctuations, but the result of the effect of the
Gaia observational errors in measured proper motions. As
explained in Section 2.5 (Fig. 6), the size of the UFDGs in the
proper motion plane changes with distance in a peculiar
way, presenting several minima at around the mentioned
distances. At these distances, therefore, the UFDGs are slightly
more concentrated in proper motion and, hence, easier to
detect. The upper left part of the panel, which does not contain
any coloured squares or black crosses, is the region where
systems do not have stars that can be observed by Gaia.

From these simple tests, one can see that some properties
of the UFDGs are more relevant for the detections. In
Figure 9. Detectability tests run with several ensembles of
UFDGs with only two varying parameters: $r_h$ vs $\sigma_V$ (top),
$\sigma_V$ vs $V_{\rm gal}$ (middle), $L_V$ vs distance D (bottom).
The color scale indicates the detection significance s. UFDGs with
significance over 200 have been plotted with a colour saturated at
this value. Black crosses indicate UFDGs that were not detected.


particular, the luminosity and distance, which set the number
of observable stars together with the apparent size in the
sky, seem to have a larger impact on the significance of our
detections than the size in proper motion space and the
position of the peak with respect to the background.


4.2 The “effective” parameter space 

If we look at the essence of our problem devoid of its
astronomical context, our task is to identify common peaks in
two different planes, subject to a noisy and not necessarily
uniform background. Seen as such, the key parameters upon
which a successful detection depends are: the height of the
peaks compared with the background level and the spread
of the peaks, that is the apparent sizes of the UFDGs in the
sky $\theta$ and proper motion planes $\sigma_\mu$, and the number
of observed stars that they are composed of, $N_{\rm obs}$, with
respect to the background. The probability will also depend on the
projection of the center of mass velocity in the proper motion
plane, since this determines how close the UFDG peak appears
to the center of the proper motions distribution, where
the majority of the background contaminants lie.

One can also see that for a certain UFDG that has been
detected in the optimal scales (that is, almost all stars in the
UFDG are enclosed inside the joint peak detection),
$N_{\rm com} \approx N_{\rm obs} + N_{\rm BG,in}$, where
$N_{\rm BG,in}$ is the number of background stars
that fall inside the joint peak detection. Assuming that the
number of background stars inside the joint peak is similar
to the expected one, that is
$N_{\rm BG,in} \approx \langle N_{\rm com}\rangle$, the significance
of Eq. 6 is equivalent to

$$s \approx \frac{N_{\rm obs}}{\sqrt{N_{\rm BG,in}}}. \qquad (7)$$

As $N_{\rm BG,in} \propto N_{\rm BG}\,\theta^2 \sigma_\mu^2$, it
follows from Eq. 7 that UFDGs that have the same ratio
$N_{\rm obs}/\theta$ and all the rest of the parameters equal
(including the LOS), have also approximately the same significance
(see the footnote on apparent surface density). The same applies
to UFDGs with the same ratio $N_{\rm obs}/\sigma_\mu$. For this
reason, given a certain LOS (we deal with different LOS in
Section 4.4) we can describe our detection problem based on these
two quantities, $N_{\rm obs}/\theta$ and $N_{\rm obs}/\sigma_\mu$,
together with the position of the peak in proper motion space. We
call these the “effective parameters”. The latter has, however,
less relevance compared to other properties, as already seen.
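The step from Eq. 6 to Eq. 7, and the resulting dependence on the effective parameters, can be written out explicitly. This is only a restatement of the argument above under the same two approximations; the proportionality of $N_{\rm BG,in}$ to $\theta^2 \sigma_\mu^2$ assumes that the joint peak is a circle of radius $\sim\theta$ in the sky and $\sim\sigma_\mu$ in proper motion, with a locally constant background density.

```latex
\begin{align}
s &= \frac{N_{\rm com} - \langle N_{\rm com}\rangle}{\sqrt{\langle N_{\rm com}\rangle}}
   \approx \frac{(N_{\rm obs} + N_{\rm BG,in}) - N_{\rm BG,in}}{\sqrt{N_{\rm BG,in}}}
   = \frac{N_{\rm obs}}{\sqrt{N_{\rm BG,in}}}, \\
N_{\rm BG,in} &\propto N_{\rm BG}\,\theta^{2}\sigma_\mu^{2}
   \quad\Longrightarrow\quad
   s \propto \frac{1}{\sqrt{N_{\rm BG}}}\,
             \frac{N_{\rm obs}}{\theta\,\sigma_\mu}.
\end{align}
```

At fixed $\sigma_\mu$ and LOS the significance therefore depends on $N_{\rm obs}$ and $\theta$ only through $N_{\rm obs}/\theta$, and at fixed $\theta$ only through $N_{\rm obs}/\sigma_\mu$, rather than through the surface densities $N_{\rm obs}/\theta^2$ or $N_{\rm obs}/\sigma_\mu^2$.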

These quantities depend in turn on other astronomical
parameters that characterize the system and its position with
respect to the observer. But the successful detection of an
UFDG depends only on a limited number of combinations
of them, that is on the effective parameters. The importance
of the effective parameters is that they reduce the
dimensionality of the parameter space where we need to determine
the boundaries of successful detection of our procedure. In
particular, while we describe our UFDGs by using 9 physical
parameters, the detection of these systems depends only on
3 (and mainly 2) effective parameters.

Here we test that this concept is indeed correct. To evaluate
the dependency of the detection limits and significance
on these effective parameters, we have built a library of 2000

Note also that UFDGs with the same Nobs I or NobslF^ (appar¬ 
ent “surface” density) do not have the same significance. 
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Figure 10. Detectability tests ran with a library of synthetic 
UFDGs built with constant apparent size in the sky 9 and proper 
motion plane cr^, as well as projected center of mass velocity. Re¬ 
gions of constant effective parameters and ''si'" 

tical lines in this plot. The panel show the significance s as function 
of the physical parameters distance {left axis) and half-light radii 
(right axis) as a function of Mobs- UFDGs with significance over 
200 have been plotted with a colour saturated at this value. Black 
crosses indicate UFDGs that were not detected. 


UFDGs with varying A^obs, but keeping constant the apparent 
sizes in the sky and proper motion planes, as well as pro¬ 
jected center of mass velocity. Note that there is no straight¬ 
forward way of generating UFDGs with the same real size 
in proper motion space cr^ because of the effects of obser¬ 
vational errors in proper motion (Section 2.5), and therefore 
we do it approximately by generating UFDGs with constant 
A/r, assuming that Afi ~ cr^. Thus in this exercise we vary 
the physical parameters Zb, cr^ and D in a way that their com¬ 
bination (Eqs. 2 and 3) result in constant 9 and Ayu. We also 
change Vgai with distance in order to obtain the same proper 
motion peak for these UFDGs. 
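The construction of such a library can be sketched as follows. The sketch assumes that Eq. 2 reduces to the small-angle relation between half-light radius and distance, and that Eq. 3 is the standard conversion between velocity and proper motion (4.74047 km/s per mas/yr at 1 kpc). The function name and the target values are hypothetical, and observational errors are ignored.

```python
import math

K = 4.74047  # km/s corresponding to 1 mas/yr at a distance of 1 kpc

def physical_params(distance_kpc, theta_deg, delta_mu_masyr, mu_peak_masyr):
    """Return (r_h [pc], sigma_v [km/s], v_tan [km/s]) that give the same
    apparent size, proper motion dispersion and proper motion peak
    at the requested distance (small-angle approximation)."""
    r_h_pc = math.radians(theta_deg) * distance_kpc * 1.0e3
    sigma_v = K * delta_mu_masyr * distance_kpc
    v_tan = K * mu_peak_masyr * distance_kpc
    return r_h_pc, sigma_v, v_tan

# Hypothetical targets: theta = 0.1 deg, Delta mu = 0.02 mas/yr, peak at 1 mas/yr
for d in (10.0, 30.0, 100.0):
    r_h, sigma_v, v_tan = physical_params(d, 0.1, 0.02, 1.0)
    print(f"D = {d:5.1f} kpc: r_h = {r_h:6.1f} pc, "
          f"sigma_v = {sigma_v:5.2f} km/s, v_tan = {v_tan:6.1f} km/s")
```

All three physical quantities grow linearly with distance, which is why a single apparent configuration maps onto a whole family of physical systems.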

Fig. 10 shows the significance of the UFDGs of this experiment as
function of $N_{\rm obs}$ and distance $D$ (left axis). The right axis
shows the half-light radius $r_h$, which is related to $D$ through
Eq. 2 to produce the same $\theta$. Although we do not include them in
this plot, one could also draw other axes for $\sigma_v$ and
$V_{\rm gal}$, which are also related to $D$ to produce the same
$\Delta\mu$ and the same peak position in proper motion space. As
$\theta$ and $\Delta\mu$ are constant for all these UFDGs, the
effective parameters $N_{\rm obs}/\theta$ and
$N_{\rm obs}/\sigma_\mu \sim N_{\rm obs}/\Delta\mu$ are constant along
vertical lines in this plot. We see that equal significance contours
are approximately vertical (but see discussion below), which
illustrates that indeed, for constant effective parameters the
significance does not depend on the physical parameters ($D$,
$\sigma_v$, $r_h$) for these experiments (contrast this with the case
of Fig. 9) but only on $N_{\rm obs}/\theta$ and $N_{\rm obs}/\sigma_\mu$.

However, the colours do not follow exactly vertical contours. This is
due to the effects of observational errors in proper motion, which
make the apparent size of the UFDGs in the proper motion plane
$\sigma_\mu$ oscillate with $D$ as already explained in Section 2.5
(Fig. 6). This is the same effect as in Fig. 9 (bottom panel). Here we
generated these UFDGs with $\Delta\mu$ constant, but not $\sigma_\mu$.
In Fig. 10 we can see how UFDGs at around 30 kpc are better detected
because at this distance the apparent size $\sigma_\mu$ decreases.
Although there is a similar effect around 70 and 135 kpc, these are
not so clearly seen here.

Figure 11. Top: Detectability tests run with a large library of
synthetic UFDGs as function of two effective parameters:
$N_{\rm obs}/\theta$ and $N_{\rm obs}/\sigma_\mu$. The colour scale
indicates the significance $s$. UFDGs with significance over 300 have
been plotted with a colour saturated at this value. Black crosses
indicate UFDGs that were not detected. Bottom: Significance of the
detections of the same library, but for detections made only in the
sky plane (see text for details).


4.3 Limits of detection and completeness 

Having seen that UFDGs with the same effective parameters have the
same significance, we can explore the detection limits and
completeness as function of them alone. In the following we show the
significance of a large synthetic library (15 000 UFDGs) that spans a
large range in the effective parameters. All of the physical
parameters of the UFDGs in this test are varied, except the LOS ($l$
and $b$).

Fig. 11 (top panel) shows the results of this test as function of the
two effective parameters $N_{\rm obs}/\theta$ and
$N_{\rm obs}/\sigma_\mu$. The colours are well separated, i.e. not
strongly mixed, in this plot, showing that despite the physical
properties of the UFDGs being very different†, the significance of the
detections depends mainly on these two effective parameters. Though
the third effective parameter (position of the peak in proper motion
space) changes in the UFDGs of this test, the uniformity of the
colours in a given region of this plot indicates that its influence is
not as relevant as that of the other parameters, as already shown.

Figure 12. Fraction $F_{\rm rec}$ of detected UFDGs as function of the
effective parameters. Black squares with a central white dot are
regions where the fraction recovered is exactly equal to 0 (this is to
differentiate them from regions with small recovered fraction). The
blue squares and the green diamonds show the estimated positions of
classical dSphs and UFDGs, respectively, with labels as in Fig. 2.
White contours indicate the detection significance from Fig. 11 (top).

In more detail, we can also see that over most of the plot, and
especially in the upper half, the colours follow an approximately
vertical structure, i.e. the significance is mainly given by
$N_{\rm obs}/\theta$. In the lower part, the contours are more curved.
Finally, note also how the undetected objects lie in the regions of
low $N_{\rm obs}/\theta$ and/or low $N_{\rm obs}/\sigma_\mu$, i.e. the
most diffuse objects.

We find that the minimum significance of our positive detections is
$s \sim 5$. This is because of the threshold imposed on $\ln P$ in
order to filter false detections (see Section 3.4).

In Fig. 12 we use the same test described above to estimate the
fraction $F_{\rm rec}$ of detected UFDGs in each region of the
effective parameter space. To do this, we have binned this space
logarithmically and computed how many of the generated UFDGs in each
bin are successfully detected. We only plot bins with at least 4
simulated UFDGs. The median number of UFDGs in each cell is 14. Note
how for most of the space explored this fraction is close to 1 (ocher
colours). There is a transition zone where fractions go from ~ 0.5 to
0.2. To differentiate regions with small recovered fraction from
regions with this fraction equal to 0 (all of them with dark colours),
we have marked the latter with a white central dot. The region where
our method is not able to detect objects is that of low
$N_{\rm obs}/\theta$ and/or low $N_{\rm obs}/\sigma_\mu$, as expected.
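The binning procedure just described can be sketched in a few lines of Python. This is a minimal illustration using only the standard library: the bin width, the random toy library and the toy detection rule are assumptions, and only the logarithmic binning and the minimum of 4 objects per bin come from the text.

```python
import math
import random
from collections import defaultdict

def recovery_fraction(x, y, detected, bin_width_dex=0.25, min_count=4):
    """Fraction of detected objects in logarithmic bins of (x, y).
    Returns {(ix, iy): fraction} for bins with at least min_count objects."""
    total = defaultdict(int)
    found = defaultdict(int)
    for xi, yi, di in zip(x, y, detected):
        cell = (math.floor(math.log10(xi) / bin_width_dex),
                math.floor(math.log10(yi) / bin_width_dex))
        total[cell] += 1
        found[cell] += bool(di)
    return {cell: found[cell] / n for cell, n in total.items() if n >= min_count}

# Toy library: log-uniform effective parameters and a made-up detection rule
random.seed(1)
x = [10 ** random.uniform(0.0, 3.0) for _ in range(2000)]
y = [10 ** random.uniform(0.0, 3.0) for _ in range(2000)]
detected = [xi * yi > 100.0 for xi, yi in zip(x, y)]

frac = recovery_fraction(x, y, detected)
print(f"{len(frac)} bins with at least 4 simulated objects")
```

Bins that contain fewer simulated objects than the minimum are simply omitted from the returned dictionary, mirroring the bins that are left unplotted.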

† Remember that this is not a cross section as in Fig. 9.

In this plot we superpose white contours indicating the significance
$s$ of the detections from Fig. 11 (top), computed as the median
significance in the same grid used in this plot, including in the
computation the cases that were not detected, that is with $s = 0$.
Note that these contours are just approximate, given that there is
some dispersion in the significance. For instance, around the contours
of $s = 5$ and $s = 10$, the dispersion in $s$ is 5-10. We see how the
transition zone corresponds to the region of significance roughly
around 5. For significance higher than 10, the recovered fraction is
between 0.7 and 1.0.

We also plot in Fig. 12 the estimated values of effective parameters
for the known MW satellites (classical dSph and UFDGs, blue squares
and green diamonds, respectively). The values of $\sigma_\mu$ for
these known systems have been estimated by interpolating in a plot
similar to Fig. 6 (bottom) but for velocity dispersions $\sigma_v$
between 5 and 10 km s$^{-1}$. As expected, all of the classical
satellites of the MW lie in a region of effective parameters where our
algorithm applied to the mock Gaia data successfully detects all
simulated systems. Note that some of the classical satellites lie
outside the upper limits of the plot. Out of the 13 UFDGs that would
have stars observable by Gaia (Leo T would not be observable), 1 of
them lies in a region where $F_{\rm rec}$ is 1.0 (Boo (3)) and 4 of
them lie in regions with recovery fraction of 0.9 (CVnI (1), Her (2),
UMaII (7), CmB (8)). Besides, WilI (10), UMa (4), SegII (12) and BooII
(9) are recovered with fractions of 0.8, 0.7, 0.6, 0.5, respectively.
SegI (11) and LeoIV (5) are in regions with $F_{\rm rec}$ of 0.3 and
0.1, respectively. Finally, CVnII (6) and LeoV (5) are outside the
limits of detection. Note that in this plot we see how systems with
effective parameters similar to the known UFDGs would be detected by
our algorithm. But some of the known UFDGs have a small number of
observable stars (e.g. LeoIV (5) and SegII (12) have
$N_{\rm obs} \sim 5$ and $\sim 7$, respectively) and our tests are
done for a minimum of $N_{\rm obs} \sim 10$. Nevertheless, for these
low luminosity known cases, the estimated $N_{\rm obs}$ is very
uncertain.

It is remarkable that our tests indicate that it is possible to detect
with Gaia UFDGs similar to some of the ones detected by SDSS, which is
2 mag deeper. This is because, whereas the UFDGs of SDSS were detected
with photometry alone, our search is done also in the proper motion
plane. The bottom panel of Fig. 11 is the same as the top panel, but
plotting the significance that would correspond to these detections if
the search had been made only in the sky plane, that is, not including
proper motion data in the detection algorithm. The significance $s$ is
now calculated through
$s = (N_{\rm sky} - \langle N_{\rm sky} \rangle)/\sqrt{\langle N_{\rm sky} \rangle}$,
where $N_{\rm sky}$ is the number of stars in the detected peak in the
sky. We only plot the significance for the detections that had at
least† $s = 3$. The remaining simulated UFDGs are plotted with black
crosses. The colours now follow vertical contours, as the vertical
axis plays no role. Comparing this plot with the top panel we see how
much the significance increases when the proper motions are included
in the search. Red colours (maximum significance) are only achieved
for higher $N_{\rm obs}/\theta$, that is for more densely populated
objects in the sky (right part). Also the limits of detection are now
located at larger $N_{\rm obs}/\theta$.

† Note that in the top panel the minimum $s$ found was 5.

Figure 13. Surface brightness of the detected UFDGs as function of the
effective parameters. Black dots indicate UFDGs that were not
detected. Detected objects with surface brightness larger than 30
mag/arcsec$^2$ are highlighted in red colours. The blue squares and
the green diamonds show the estimated positions of classical dSphs and
UFDGs, respectively, with labels as in Fig. 2.

Fig. 13 shows the surface brightness of the detected UFDGs as function
of the effective parameters. Contours of similar surface brightness
are approximately diagonal in this parameter space. We have marked
with red the detected synthetic UFDGs with surface brightness dimmer
than 30 mag/arcsec$^2$, which is the global SDSS surface brightness
limit as found in Koposov et al. (2008). Very interestingly, the red
squares mark out an area in the parameter space of UFDGs less bright
than the SDSS limit that would be possible to explore with Gaia. Note,
nonetheless, that this region has a recovery fraction $F_{\rm rec}$
smaller than 0.8.

In Fig. 13 (and in Fig. 12) all known satellites of the
MW lie in an approximate diagonal line in the effective parameter
space. We believe that this is a projection of the
fundamental curve mass-radius-luminosity studied in e.g.
Tollerud et al. (2011), or in more detail, a consequence of
the Faber-Jackson and the $r_h$-$L$ scaling relations. The plot
shows that our algorithm would be able to detect objects 
that are outside this diagonal. However, UFDGs that lie be¬ 
low the diagonal have surface brightness brighter than the 
SDSS limit and they would have already been detected, un¬ 
less they are all located outside the SDSS footprint. On the 
other hand, part of the red region with surface brightness 
dimmer than the SDSS limit but that Gaia could probe lies 
outside the diagonal and, therefore, the detection of objects 
in it relies on the existence of them. Note, however, that the 
scatter across the diagonal is large. 

Figure 14. Fraction $F_{\rm rec}$ of detected UFDGs (colour-scale) as
a function of $D$ (top), $\sigma_v$ (middle) and $r_h$ (bottom) versus
$M_V$. Black squares with a central white dot are regions where the
fraction recovered is exactly equal to 0. The blue squares and the
green diamonds show the estimated positions of classical dSphs and
UFDGs, respectively. The black diagonal line in the bottom panel shows
the SDSS surface brightness limit $\mu_V = 30$ mag/arcsec$^2$.

Figure 15. Same as the lower panel of Fig. 14 but only considering
UFDGs with velocity dispersion $\sigma_v < 10$ km s$^{-1}$.

Fig. 14 illustrates the recovered fraction (colour-scale) of UFDGs but
now in terms of the physical parameters $D$, $\sigma_v$ and $r_h$ as a
function of $M_V$. In these panels, the region with fractions between
0.9 and 1 (ocher colours) occupies a much smaller portion of the
explored ranges. This is because, in these plots, in any given bin
only two physical parameters are fixed while the remaining ones vary
over the entire explored range, which can result in a very different
detection significance. This has the effect of lowering $F_{\rm rec}$
on average, while also resulting in more diffuse boundaries between
areas with different $F_{\rm rec}$, as opposed to the sharp boundaries
seen in the effective parameter plane of Fig. 12, which corroborates
the fact that our detection scheme does depend mainly on the effective
parameters. In this figure we also show the positions of known UFDGs
and classical dSph galaxies, though without the number labels. These
are shown to illustrate the typical recovery fraction one would expect
for a galaxy with, e.g. a given $r_h$ - $M_V$, if its velocity
dispersion and other parameters are unknown but restricted to the
range spanned by our library.

This implicit dependence on the other physical parameters means that
the behaviour of $F_{\rm rec}$ in these plots will change depending on
the assumed distributions for the different parameters, and so,
strictly speaking, the reported $F_{\rm rec}$ is only valid under the
assumed log-uniform distributions. For instance, if we consider only
UFDGs with small velocity dispersion ($\sigma_v < 10$ km s$^{-1}$;
Fig. 15), the boundaries of detection improve significantly, i.e. the
algorithm could detect larger UFDGs at the same given $M_V$. The
simple distributions assumed for the physical parameters do allow us,
however, to illustrate the limits of our method. Finally, it is worth
noticing that the recovered fractions shown in Figs. 12 and 14 can be
interpreted in a probabilistic sense as the probability that any
individual galaxy is detected by our method, given two of its physical
or effective parameters.

The diagonal line in the bottom panel of Fig. 14 shows the SDSS
surface brightness limit of $\mu_V = 30$ mag/arcsec$^2$. Lines of
equal surface brightness are diagonal lines with slope of 5 in this
plot. The detection limits of our algorithm (for example considering
the line delineated by the red or blue coloured bins) have a similar
slope at a slightly lower surface brightness, with an additional
vertical limit at $M_V \sim -1.5$ (but note that all this depends on
the underlying distribution of physical parameters).
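The slope of the lines of constant surface brightness can be verified with a short Python sketch. It assumes one common definition, the mean surface brightness inside the half-light radius with half of the light enclosed; the definition used for Fig. 14 may differ by a constant offset, which does not change the slope. The function names are hypothetical.

```python
import math

ARCSEC_TERM = 5.0 * math.log10(206264.8 / 10.0)  # about 21.572 mag

def mean_surface_brightness(m_v, r_h_pc):
    """Mean surface brightness inside r_h in mag/arcsec^2, assuming that
    half of the total light falls inside the half-light radius."""
    return m_v + 2.5 * math.log10(2.0 * math.pi * r_h_pc**2) + ARCSEC_TERM

def r_h_on_line(m_v, mu_limit):
    """Half-light radius (pc) at which a system of magnitude m_v has
    exactly the surface brightness mu_limit."""
    exponent = (mu_limit - m_v - ARCSEC_TERM - 2.5 * math.log10(2.0 * math.pi)) / 5.0
    return 10.0 ** exponent

for m_v in (-1.5, -4.0, -6.5):
    print(f"M_V = {m_v:5.1f}: r_h = {r_h_on_line(m_v, 30.0):8.1f} pc at mu_V = 30")
```

Under this definition, increasing $r_h$ by one dex at fixed $M_V$ dims the surface brightness by exactly 5 mag, which is the slope quoted above.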

The shape of the detection frontiers in the lower panel 
of Fig. 14 is similar to the ones of Koposov et al. (2008) 
in their Figs. 10 and 11. For brighter systems the detection
limits follow a diagonal line with the slope of a constant
surface brightness line, followed by a vertical cut at a certain
absolute magnitude. In the case of Koposov et al. (2008), 
the surface brightness and absolute magnitude limits vary 
as function of distance. However, the detection limits of the 
two studies are not directly comparable because our method 
is based on different information, as it includes kinematics. 
Our effective parameter space, where the detection limits are 
defined, is essentially different (with more dimensions). 


4.4 Changing the background 

The results of the previous section correspond to our fiducial field
($l$, $b$) = (90°, 30°). We now explore how these results change for
different LOSs that have a different number of stars $N_{\rm BG}$ and
a different distribution of proper motions.

Table 4. Comparison of the significance of an ensemble of 1000 UFDGs
located at different fields.

   field A       field B      expected   median
   l     b       l     b      s_B/s_A    s_B/s_A    AMD    P_10   P_90

   90    30      90    42       1.4        1.2      0.2    0.8    1.9
   90    30      90    55       1.6        1.4      0.3    0.9    2.4
   90    30      90    68       1.9        1.7      0.4    1.0    3.0
   90    30      90    80       1.9        1.9      0.4    1.0    3.2
   90    42      90    55       1.2        1.2      0.2    0.7    1.8
  170    30     170    42       1.3        1.2      0.2    0.8    1.8
  170    30     170    55       1.3        1.4      0.2    0.9    2.2
  170    30     170    68       1.4        1.4      0.2    0.9    2.1
  170    30     170    80       1.4        1.5      0.3    1.0    2.4
   90    30     170    30       1.4        1.4      0.5    0.6    3.3
   90    30     170    42       1.8        1.7      0.6    0.7    4.1
   90    30     170    55       1.9        1.9      0.7    0.8    4.4
   90    30     170    68       2.0        1.8      0.7    0.8    4.5
   90    30     170    80       2.0        2.0      0.8    0.9    4.5

Let $s$ be the significance of a detection in the fiducial field, in
which we have a UFDG with $N_{\rm obs}$ and a background with
$N_{\rm BG}$ observed stars. Because $N_{\rm BG,in} \propto N_{\rm BG}$,
from Eq. 7 it follows that $s \propto N_{\rm obs}/\sqrt{N_{\rm BG}}$.
Then, given a UFDG with the same effective parameters but in an
arbitrary LOS, for which the number of background stars
$N'_{\rm BG}$ has changed in a ratio $r = N'_{\rm BG}/N_{\rm BG}$, its
significance is

$$ s' \approx \frac{s}{\sqrt{r}} \qquad (8) $$

This is a useful relation that allows us to establish the significance
and detection limits in the effective parameter space for different
LOSs without running additional experiments.

For instance, the number of background stars in the fiducial field is
$N_{\rm BG} = 1413$, and for two different LOSs at ($l$, $b$) =
(90°, 55°) and ($l$, $b$) = (90°, 80°) it is $N'_{\rm BG} = 525$ and
$N'_{\rm BG} = 377$, respectively. Therefore the background has
decreased by factors $r = 0.37$ and $r = 0.27$, respectively, with
respect to the fiducial case. Thus, we expect the significance of
UFDGs with the same effective parameters to increase to $s' = 1.6s$
and $s' = 1.9s$, respectively.
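The arithmetic of this example can be reproduced with a few lines of Python. The function name is hypothetical; the star counts are the ones quoted above.

```python
import math

def rescale_significance(s, n_bg_new, n_bg_ref):
    """Eq. 8: s' ~ s / sqrt(r), with r = N'_BG / N_BG."""
    r = n_bg_new / n_bg_ref
    return s / math.sqrt(r)

N_BG_FIDUCIAL = 1413  # (l, b) = (90, 30) deg

for n_bg_new in (525, 377):  # (90, 55) deg and (90, 80) deg
    r = n_bg_new / N_BG_FIDUCIAL
    factor = rescale_significance(1.0, n_bg_new, N_BG_FIDUCIAL)
    print(f"N'_BG = {n_bg_new}: r = {r:.2f}, s'/s = {factor:.1f}")
```

The two factors agree with the expected ratios listed in Table 4 for the same pairs of fields.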

In the following we check that this relation is correct. We use a
library of 1000 UFDGs (randomly extracted from the library of Section
4.3), and locate copies of it in different LOSs. We then compare the
significance one by one for different pairs of LOSs. Note, however,
that because we keep the proper motion of each UFDG constant, its
relative position with respect to the centroid will change depending
on the LOS (because the underlying distribution changes), thus
changing one of the effective parameters. For this reason, and also
because of the approximations used to derive Eq. 8 and because Gaia
errors change with LOS, we expect a certain dispersion around the
values predicted by Eq. 8.

Table 4 compares the expected value of $s_B/s_A$ with the median
observed values computed with all 1000 UFDGs at different pairs of LOS
A and B. We also give the Absolute Median Deviation (AMD), and the 10
per cent ($P_{10}$) and 90 per cent ($P_{90}$) percentiles. The median
ratios $s_B/s_A$ differ by at most 0.2 from the expected values. We
see also that there is some expected dispersion with respect to this
value. The cases where more dispersion is observed are when we compare
fields at different longitudes (last rows). This is because in these
cases the distribution of background proper motion changes the most.
However, by looking at the percentiles we see that most of the
dispersion comes from values that are higher than the expected value
(thus improving the significance). The $P_{10}$ is always around 1.
This means that the significance of all the fields B is smaller than
predicted, but larger than the significance of the fields A in ~ 40
per cent of the cases. For ~ 50 per cent of the cases the significance
of all the fields B is larger than expected.
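The summary statistics reported in Table 4 can be computed as in the following Python sketch. The input significances are random toy numbers, not our results, and discarding objects with zero significance in field A is an assumption of the sketch.

```python
import random
import statistics

def summarize_ratios(s_a, s_b):
    """Median, absolute median deviation (AMD), and 10 and 90 per cent
    percentiles of the one-by-one ratio s_B/s_A."""
    ratios = [b / a for a, b in zip(s_a, s_b) if a > 0]  # assumption: skip s_A = 0
    med = statistics.median(ratios)
    amd = statistics.median(abs(x - med) for x in ratios)
    deciles = statistics.quantiles(ratios, n=10)  # nine cut points
    return med, amd, deciles[0], deciles[-1]

# Toy data: 1000 objects whose significance grows by ~1.6 with some scatter
random.seed(0)
s_a = [random.uniform(5.0, 50.0) for _ in range(1000)]
s_b = [a * 1.6 * random.lognormvariate(0.0, 0.3) for a in s_a]

med, amd, p10, p90 = summarize_ratios(s_a, s_b)
print(f"median = {med:.2f}, AMD = {amd:.2f}, P10 = {p10:.2f}, P90 = {p90:.2f}")
```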

In conclusion, the significance of the detections of our fiducial
field at $l = 90°$ and $b = 30°$ is maintained or improved in ~ 90 per
cent of the simulated cases in the other fields where $N_{\rm BG}$ was
smaller. The boundaries of the detection will also improve for these
fields. But the scaling between significance and fraction of recovery
is not straightforward and one would need to evaluate this in each
particular field. Eq. 8 offers, however, a fast approximate way of
comparing the success of the detections in different LOSs.


5 CAVEATS 

The method that we have introduced here has its limitations 
and assumptions, which we will review here. 

First of all, we emphasize that our method does not 
aim to characterize and study UFDGs but it is a probabilis¬ 
tic method to identify possible candidates. When applied to 
real data, it will provide us with a list of candidates that will 
need to be studied in detail. The colour-magnitude diagrams 
of Gaia photometry can be used for this, as well as to derive 
morphological properties, kinematics, distances, etc, once a 
proper filter to select the UFDG population is designed, as 
has been done with SDSS (Willman et al. 2002). Also, a fol¬ 
low up using ground-based facilities will be required to ob¬ 
tain radial velocities and detailed chemical abundances. 

Our procedure has been tested for Galactic latitudes above 30° in
fields of 2° x 2° and has been designed for UFDGs at distances larger
than 10 kpc. If we want to apply it to search for nearer systems,
different parallax and $\log g$ cuts would be needed. However, larger
scales in the sky would have to be probed, increasing the background
level. Another limitation of the method is that it is optimized for
UFDGs with angular sizes smaller than the 2° x 2° fields, and the
detection of larger systems would require, again, probing larger
scales in the sky and perhaps a different strategy.

Likewise, it has been tested for UFDGs modeled as Plummer spheres with
isotropic velocity distributions where light follows mass. A change in
these assumptions that results in a variation in the footprints in the
sky or proper motion planes will change the effectiveness of our
method, although the limits we have encountered should remain the same
when expressed in terms of the effective parameters. It is the mapping
from structural to effective parameters that would need to be
established for the new UFDG models. Similarly, the boundaries that
define the limits of our detection method in $N_{\rm obs}$ will simply
map into different boundaries in stellar luminosity if the stellar
population content is changed.

Our background clearly comes from smooth distribu¬ 
tions without streams or clouds. A clumpy halo may af¬ 
fect the number of background stars compared to our esti¬ 
mations with GUMS. However, our algorithm will also de¬ 
tect other systems that are not necessarily UFDGs as long 
as they present some coherence in the 4-dimensional space 
that we use. These detections, rather than being considered
as additional false positives, will be interesting systems to
follow up.

Previous studies to detect UFDGs with photometric surveys apply an
isochrone masking or a probabilistic modelling in the colour-magnitude
or colour-colour diagram in order to filter out field stars (e.g.
Koposov et al. 2008). Instead, here we do cuts using parallax and
surface gravity. Some preliminary tests show that the addition of an
isochrone masking in the Gaia $G$ vs. $G_{\rm BP} - G_{\rm RP}$ plane
in our algorithm may be beneficial in particular cases. This merits a
separate investigation that we aim to undertake in the future.

Although we have not included unresolved galaxies and quasars in our
simulated background, we have checked that these will have a minor
effect on our results. According to Bailer-Jones et al. (2013) (their
table 3) the fraction of misclassified galaxies and quasars is 2.5%
(2% misclassified as stars and 0.5% as binary systems) and 8.9% (5.9%
as stars and 0.1% as binary systems), respectively. From GUMS
simulations we have estimated that the number of galaxies and quasars
in our fiducial field would be 4051 and 120, respectively. Therefore,
there will be ~ 110 objects (mainly galaxies) misclassified as stars,
corresponding to an increase of 7% of the back/foreground population
in our fiducial field, and up to ~ 30% for other LOS (($l$, $b$) =
(180°, 80°)). However, this increase in the number of field stars will
imply approximately a change in the significance of the detections of
only a factor 0.96 and 0.88 (Eq. 8), respectively, in the two LOS
described.
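These factors follow directly from Eq. 8, as the short Python check below illustrates. It assumes that the total misclassification fractions (2.5% and 8.9%) are the ones that enter the count of about 110 contaminants, and it takes the 30% increase for the second LOS as given.

```python
import math

N_BG = 1413                  # stars in the fiducial field
N_GAL, N_QSO = 4051, 120     # galaxies and quasars expected from GUMS
F_GAL, F_QSO = 0.025, 0.089  # total misclassified fractions (assumed to apply)

n_extra = N_GAL * F_GAL + N_QSO * F_QSO
r_fiducial = (N_BG + n_extra) / N_BG        # relative growth of the background
factor_fiducial = 1.0 / math.sqrt(r_fiducial)  # Eq. 8 with s = 1
factor_other = 1.0 / math.sqrt(1.30)           # LOS with a ~30% increase

print(f"extra contaminants: {n_extra:.0f}")
print(f"significance factors: {factor_fiducial:.2f} and {factor_other:.2f}")
```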

One important aspect of the Gaia astrometric data that 
we have not taken into account in this work is the issue of 
covariances in the estimated astrometric parameters. As ex¬ 
plained in Lindegren et al. (2012), the statistical correlation 
between the different astrometric parameters will occur be¬ 
tween the parameters of the same source and also between 
the parameters of different sources. The within-source error 
covariances can be similar for collections of sources in small 
areas of the sky, as can be seen for example in the statistical plots
in Volume 1 of the Hipparcos Catalogue documentation (ESA 1997). In
the proper motion plane for small areas on the sky (such as used in
this study) this can lead to apparent structure in the proper motion
distribution (caused by elongated and preferentially oriented error
ellipses). The between-source covariances will have a similar effect
and are estimated in the case of Gaia to be most pronounced over areas
of order 0.3° radius on the sky (the value of the correlation
half-length estimated in Holl, Hobbs & Lindegren 2010). This means
that for a large fraction of UFDGs the between-source correlations
will be important in addition to the within-source correlations. To
first order the main effect will be that the interpretation of the WT
maps will be more involved, as a distinction will have to be made
between real and spurious structure in the proper motion plane.

The within-source covariance matrix of the astromet¬ 
ric parameters will be provided as part of the Gaia data 
releases. The covariance matrix of the astrometric param¬ 
eters of different sources cannot be calculated for the full 
Gaia catalogue but it is feasible to do so for limited groups 
of sources as demonstrated in Holl & Lindegren (2012) and 
Holl, Lindegren & Hobbs (2012). Hence we will be able to 
account for the error covariances but we defer to a future 
study the details of how to implement this in practice. 
6 DISCUSSION AND CONCLUSIONS 

We have introduced an automatic procedure to identify UFDG candidates
in the future Gaia database and charted its detection limits. The main
advantages of using Gaia data in the search for UFDGs are, first, the
inclusion of kinematics (proper motions) in the detection algorithm
for the first time; and second, the Gaia full sky coverage, being the
first unbiased homogeneous survey to be used for this purpose.

Our procedure identifies significant overdense peaks in the planes of
the sky and of proper motions that share common stars. Then the
probability of this occurring by chance is assessed and used to
discard spurious detections. We have used a library of ~ 30 000
synthetic UFDGs to probe the 9-D space of intrinsic ($L_V$, $r_h$,
$\sigma_v$) and extrinsic ($l$, $b$, $D$, $V_{\rm gal}$, $\phi_V$,
$\theta_V$) UFDG parameters, spanning ranges that extend well beyond
those occupied by currently known systems.

We have identified the “effective parameters” that our algorithm
depends mainly on. The main two are the ratios of the number of stars
observable by Gaia in the UFDGs to their apparent sizes in the sky
($N_{\rm obs}/\theta$) and proper motion planes
($N_{\rm obs}/\sigma_\mu$). The position of the peak in proper motion
with respect to the background also influences the detection, but is
not as relevant. These parameters reduce the dimensionality of our
problem to 3, mainly 2, parameters.

We have charted the limits of detectability and completeness (recovery
fraction) of our search in the effective parameter space (Fig. 12) for
a LOS at $l = 90°$ and $b = 30°$. Detections can be made with high
significance over most of the explored region, which includes the
majority of the currently known UFDGs, with a recovered fraction that
remains above 70 per cent over most of it. It is only in the corner of
small effective parameters that the efficacy of our method decreases
abruptly. On the other hand, the limits of our detection procedure
cannot be described in terms of a limiting surface brightness alone
(Fig. 13), because of the inclusion of kinematics in the search.

We have derived a relation that allows us to know the approximate
detection significance of the synthetic UFDGs at LOSs with a different
number of background stars. The translation from significance to
recovery fraction is not straightforward and one would need a more
thorough characterization per LOS. However, most of the results
presented here are for a pessimistic case compared to higher
latitudes, or to the outer galaxy ($l = 180°$), where we expect less
field contamination.

Furthermore, we have explored the extent to which cur¬ 
rent detectability limits can be pushed forward, opening the 
possibility of detecting real systems hitherto not found. We 
have found that there is a region in the effective parameter 
space where there are currently no observed systems. Part 
of this region corresponds to UFDGs with surface bright¬ 
ness brighter than the SDSS limit and, therefore, they would 
have already been detected, unless they are all located out¬ 
side of the SDSS footprint. But more interestingly, we have 
seen that Gaia will be able to probe a region of the effec¬ 
tive parameter space of surface brightness dimmer than the 
SDSS limit, if such objects exist, albeit with a recovery frac¬ 
tion smaller than 0.8. Note that the recent UFDG discover¬ 
ies made with DECam have similar surface brightness to the 
ones detected by SDSS (see Fig. 17 in Koposov et al. 2015). 
Also because of the different detection methodologies followed by SDSS
and DECam compared to Gaia, the nature of the detection limits is
completely different, thus offering the possibility to explore
uncovered regions of the parameter space (both with respect to other
surveys in the north and in the south) and over the whole sky.

We can make a very rough estimation of the number 
of UFDGs that Gaia will detect from the recovery fractions 
that we have found for our synthetic search (Fig. 12), assum¬ 
ing isotropy on the distribution of satellites in the MW halo, 
and considering only SDSS UFDGs. There is 1 known ob¬ 
ject (Boo) that would be detected with a recovery fraction of 
1.0, 4 objects (CVnI, Her, UMall, CmB) with a fraction of 
0.9, and 4 objects (Will, UMa, Segll, BooII) with fractions 
of 0.8, 0.7, 0.6 and 0.5, respectively. We do not count ob¬ 
jects with a recovery fraction below 0.5. This makes a total 
of 7.2 UFDGs in a sky area equivalent to SDSS (~ 1/5 of the 
sky, Koposov et al. 2008). If we assume that Gaia will detect 
UFDGs only above |b| = 30°, which corresponds to 1/2 of the
sky, there should be of the order of ~ 10 new UFDGs (i.e.
currently not known) over the 1/2-1/5=3/10 of the sky that
remains unexplored, that is, subtracting the area already
covered by SDSS. These calculations are based on the field at
l = 90° and b = 30° but could be slightly better for higher
latitudes.
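
The arithmetic behind this rough estimate can be written out explicitly. The short Python sketch below is only an illustration: the function and variable names are ours, the recovery fractions are those quoted in the text, and the sky fractions are the approximate values assumed above (1/5 for SDSS, and 1/2 for latitudes more than 30° from the Galactic plane, which follows from 1 - sin 30°).

```python
import math

# Recovery fractions quoted in the text for the nine known SDSS UFDGs
# that have a recovery fraction of at least 0.5.
RECOVERY_FRACTIONS = [1.0] + [0.9] * 4 + [0.8, 0.7, 0.6, 0.5]

# Approximate sky fractions assumed in the text.
SDSS_SKY_FRACTION = 1.0 / 5.0
# Fraction of the sphere with |b| > 30 deg: 1 - sin(30 deg) = 1/2.
SEARCHABLE_SKY_FRACTION = 1.0 - math.sin(math.radians(30.0))


def expected_detections(fractions):
    """Expected number of recovered objects: the sum of recovery fractions."""
    return sum(fractions)


def expected_new_detections(fractions, surveyed, searchable):
    """Scale the expected count to the area not yet surveyed.

    Assumes an isotropic satellite distribution, so the expected
    number of detections is proportional to the sky area.
    """
    per_full_sky = expected_detections(fractions) / surveyed
    return per_full_sky * (searchable - surveyed)


in_sdss_area = expected_detections(RECOVERY_FRACTIONS)
new_objects = expected_new_detections(
    RECOVERY_FRACTIONS, SDSS_SKY_FRACTION, SEARCHABLE_SKY_FRACTION
)
print(f"Expected detections in an SDSS-sized area: {in_sdss_area:.1f}")
print(f"Expected new detections over 3/10 of the sky: {new_objects:.1f}")
```

Under these assumptions the sum of recovery fractions is 7.2, and scaling it by the area ratio (3/10)/(1/5) gives about 10.8, consistent with the order-of-magnitude value of ~ 10 quoted above.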

However, by the arrival time of the Gaia catalogue (see
below) other surveys such as ATLAS (Shanks et al. 2013),
Pan-STARRS (Kaiser et al. 2010) and DES (Diehl et al.
2014) will have covered a large fraction of this area. But a part
of the South Galactic cap will still remain completely unexplored
(at declinations below stripe SPT of DES). We estimate
this to be a fraction of ~ 0.0195 of the whole sphere (by
taking the part of the spherical cap in equatorial coordinates
at δ < -65° that lies in the range α ~ [-60°, 90°]). Therefore,
there should be of the order of ~ 1 new UFDG in this
unexplored area. However, we emphasize that our method
uses information not used in other searches, namely proper
motions, and thus it could lead to new discoveries made
possible not by the region of the sky covered or the depth
probed, but by the motion of the objects on the sky. As such,
our method complements present searches.
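
The sky fraction quoted above can be checked with a few lines of Python. This is a minimal sketch under the assumptions stated in the text (a region south of declination -65° spanning right ascensions from -60° to 90°, and the isotropic density implied by 7.2 expected detections in 1/5 of the sky); the function name and constants are illustrative only.

```python
import math


def southern_patch_fraction(dec_limit_deg, ra_min_deg, ra_max_deg):
    """Fraction of the sphere south of dec_limit_deg within an RA range.

    The region south of declination d covers (1 + sin d) / 2 of the
    sphere; the RA range then selects a wedge of that cap.
    """
    cap_fraction = (1.0 + math.sin(math.radians(dec_limit_deg))) / 2.0
    ra_fraction = (ra_max_deg - ra_min_deg) / 360.0
    return cap_fraction * ra_fraction


# Values taken from the text.
patch = southern_patch_fraction(-65.0, -60.0, 90.0)

# Isotropic density implied by 7.2 expected detections in 1/5 of the sky.
EXPECTED_PER_FULL_SKY = 7.2 / (1.0 / 5.0)
expected_in_patch = EXPECTED_PER_FULL_SKY * patch

print(f"Unexplored sky fraction: {patch:.4f}")
print(f"Expected new UFDGs in that area: {expected_in_patch:.2f}")
```

With these inputs the patch covers about 0.0195 of the sphere and the expected number of detections is about 0.7, that is, of the order of one object.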

Moreover, the number of newly discovered candidates
could be higher because, as discussed above, Gaia could also
detect more UFDGs with lower surface brightness than the
SDSS limit. Besides, under the assumption of anisotropy in
the spatial distribution of satellites, this number could be
larger if the Gaia footprint happens to cover preferential
directions. In fact, the importance of having a full-sky
catalogue in this type of search for the first time is that it will
allow constraints to be placed on the isotropy of the
distribution of the satellites and, therefore, on their origin.

The known UFDGs with high recovery fraction mentioned
above could be seen as standard systems for future
Gaia discoveries, but only in terms of effective parameters.
Thus, one cannot interpret this as if, for instance, all objects
with the same half-light radius and the same distance as Boo
will be detected, but rather that systems whose combination
of physical parameters produces similar effective
parameters will be detected with high probability. Note also
that we have not considered in this calculation the influence
of the third effective parameter, which we have shown to be
less important.

Our proposed method can be applied fully to the third 
Gaia data release, scheduled for 2017/2018. This release
will include the five-parameter astrometric solutions as well
as the object classification (necessary to eliminate contaminant
extra-Galactic objects) and astrophysical parameters
such as log g, necessary for filtering out foreground dwarfs.
Preliminary searches could be conducted using earlier
releases, e.g. with the first data release in summer 2016, using
only sky coordinates; or with the second data release in
early 2017, using full sky and proper motion information,
yet without the possibility of using the foreground filters as
explained here, since astrophysical parameters will not yet
be available.

Finally, there is the future possibility that the Gaia magnitude
limit will be pushed down to G = 20.7. This will
obviously be positive in terms of the number of observable
stars in each UFDG, but will also increase the
foreground/background contamination, so the effect on the
detection probabilities will have to be assessed.